International Workshop
Behaviour Analysis and Video Understanding
in conjunction with
ICVS 2011
23 September 2011, Sophia Antipolis (France)


WELCOME to the behaviour workshop of ICVS 2011

Workshop proceedings available on-line: here


Video understanding is the real-time process of perceiving, analyzing and building a semantic description of a 3D dynamic scene observed through a network of cameras and possibly other sensors. This process consists mainly of analyzing the signal information provided by the sensors observing the scene, using a large variety of models: either models which humans usually use to understand the scene, or models defined for the purpose.

Computer vision and pattern recognition are the main technologies used for automatic monitoring of public spaces over extended durations. Effective approaches for tracking people and for recognizing poses, postures, gestures, or collective crowd phenomena in public environments have been developed in the last 5 years, especially in the video surveillance context, where the aim is to classify behaviours (suspect, unusual, abnormal). However, the core problem of understanding remains complex, and current approaches need to be improved to address real-world situations.

The main challenge lies in generating qualitative and semantic descriptions of people or object motion, up to the detailed description of body part configuration, even in complex scenes. These goals have become key tasks in many computer vision applications, such as image and scene understanding, health-care, video indexing and retrieval, video surveillance and advanced human-computer interaction.

The key questions to be answered are:

  • How far (i.e. more precise, longer activities) can we go with today's technologies when analysing people's behaviour?
  • How can we fill the gap between video signal and semantic activities?


Behaviour2011 aims at promoting interaction and collaboration among researchers specialising in related fields, including (but by no means limited to):

  • People detection and Tracking;
  • Video activity discovery;
  • Group of people, crowd analysis;
  • Multi-camera and multimodal analysis;
  • High-level behaviour recognition and understanding;
  • Long term event recognition;
  • Use of ontologies on human motion for video footage;
  • Browsing, indexing and retrieval of human behaviours in video sequences;
  • Natural-language description of human behaviours;
  • Cognitive surveillance and ambient intelligence;
  • Learning models for behaviour analysis;
  • Human behaviour synthesis: articulated models and animation;
  • Real-time systems, system evaluation;
  • Abnormal event detection.

Important dates

Submission of full papers: 8th of July 2011 (new deadline; previously 27th of June 2011)
Notification of acceptance: 18th of July 2011
Camera Ready: 15th of August 2011
Workshop 2011: Friday September 23 2011, Sophia Antipolis (France)

Workshop program

All the sessions will be held in the Jacques Morgenstern amphitheatre, Gilles Kahn Building.
Friday, September 23, 2011

09:00 - 09:10  Welcome
Francois Brémond
09:10 - 09:50
Invited Talk: New trends in supporting ageing people with mild dementia in their own living space, Mounir Mokhtari: CNRS IPAL (UMI 2955) Singapore/ Institute for Infocomm Research (I2R/A-Star)/Institut TELECOM France.

Motivated by the growth of the ageing population worldwide and the need to concentrate research efforts on a specific target group, our research focuses on ageing persons with physical and cognitive deficiencies. The primary goal is to enable the person with mild dementia, through assistive technologies, to maximize his physical and mental function, and to continue to engage in social networks, so that he can continue to lead an independent and purposeful life.

The person with mild dementia usually has few problems in actually performing most basic activities of daily living, but is often handicapped by his poor memory and thus forgets to carry out these tasks. These can include bathing, changing clothes and taking medication on time. Assistive technology that provides timely prompts and reminders will enable him to preserve his abilities and independence.

Finding an appropriate interface to interact with persons with dementia can be a challenge. Indeed, the interaction should take place through a user-friendly interface that makes it easier for the person to access the environment and to benefit from assistance. Introducing a desktop computer with a keyboard is not always well accepted. Thus, we have adopted the approach of providing assistive services through a multimodal interactive system including TV, iPad-like tablets and wireless speakers. Consequently, the reasoning level of the system may be calibrated according to the contextual situation of the user (context awareness) to bring about the service required. The user interface should support not only the user's preferences but also his profile as dictated by his cognitive abilities. At the same time, different surrounding computing platforms and devices/sensors are considered additional sources of information. In this paper we will focus mainly on the interaction level with the system as well as on the validation stages performed to meet users' requirements in terms of user interface design and content of service. This is the result of more than 3 years of work, from 2006 to 2010, within 2 European projects (IST-FP6 Cogknow project and ITEA NUADU project) and an ongoing project in Singapore (AMUPADH).

09:50 - 10:50
Session 1:
Group interaction and group tracking for video-surveillance in underground railway stations (25')
Authors:  Sofia Zaidenberg, Bernard Boulay, Carolina Garate, Duc-Phu Chau, Etienne Corvée and Francois Brémond (INRIA Sophia-Antipolis, France)

Haar like and LBP based features for face, head and people detection in video sequences (25')
Authors:  Etienne Corvee, Francois Bremond (INRIA Sophia-Antipolis, France)

10:50 - 11:10 : Break
11:10 - 12:40
Session 2:
Towards a Multi-purpose Monocular Vision-based High-Level Situation Awareness System (25')
Authors:  David Münch, Kai Jüngling, Michael Arens (Fraunhofer IOSB, Ettlingen, Germany)

Abnormal behavior detection in video protection systems (25')
Authors:  Luis Patino(1), Hamid Benhadda(2), Nedra Nefzi(1-2), Bernard Boulay(1), Francois Bremond(1), Monique Thonnat(1) ((1) INRIA Sophia-Antipolis, France, (2) THALES, France)

An unsupervised learning method for human activity recognition based on a temporal qualitative model (25')
Authors:  Franck Vandewiele and Cina Motamed (Laboratoire LISIC, Université du Littoral Côte d'Opale, Calais, France)

12:40 - 14:00 : Lunch
14:00 - 14:40
Invited Talk: Human Detection and Data Association in Multiple Camera Tracking, Richard Chang, Institute for Infocomm Research (I2R/A-Star)

Tracking multiple objects under merging, splitting, and occlusion situations is a challenging task for a surveillance system, especially when objects are close together. This is due to the ambiguity and uncertainty of the visual features of these objects, compared to a situation where tracked objects are isolated. To handle these challenges in multiple object tracking, techniques that apply observations from detected objects have been introduced. Joint probabilistic data association (JPDA) is one of them: it calculates the probability of all possible data associations between objects and observations. The advantages of JPDA and of the particle filter can then be combined to obtain better tracking results from a human detection algorithm. The detection stage can be performed using Local Binary Patterns (LBP). LBP is a texture descriptor that combines local primitives into a feature histogram. LBP and its extensions outperform existing texture descriptors with respect to both performance and computational efficiency. It is therefore suitable for foreground detection or background modelling.
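
As an illustration of the LBP descriptor mentioned in the abstract above, the following is a minimal sketch of the basic 8-neighbour variant, not the speaker's implementation. It assumes a greyscale image given as a list of rows of integers; the names NEIGHBOURS, lbp_code and lbp_histogram are hypothetical and chosen only for this example. Each interior pixel is compared with its 8 neighbours to form an 8-bit code (the local primitive), and the codes are accumulated into a 256-bin histogram.

```python
# Offsets of the 8 neighbours, clockwise starting from the top-left pixel.
# The starting point and direction are a convention assumed for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Accumulate the LBP codes of all interior pixels into 256 bins."""
    hist = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            hist[lbp_code(image, row, col)] += 1
    return hist
```

In a detector, such histograms would typically be computed per image cell and passed to a classifier; that step is outside the scope of this sketch.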

Most methods for multiple camera tracking also rely on accurate calibration to associate data from multiple cameras. However, accurate calibration is often difficult to obtain in real applications for practical reasons. Inaccurate calibration can then lead to wrong data association of objects between cameras. To handle the data association of objects in multiple cameras under an inaccurate ground plane homography, the RFS Bayes filter can be applied. Observation measurements from the cameras can be modelled as a random finite set. This random finite set includes the primary measurement from the object, extraneous measurements of the object, and clutter. Experimental results including challenging cases such as occlusions and merging persons will be presented.

14:40 - 15:40
Session 3:
Video Activity Recognition Framework for assessing motor behavioural disorders in Alzheimer Disease Patients (25')
Authors:  Veronique Joumier(1), Rim Romdhane(1), Francois Bremond(1), Monique Thonnat(1), Emmanuel Mulin(2), Philippe Henri  Robert(2) ((1) Inria Sophia Antipolis, France, (2) Centre Mémoire de Ressources et de Recherche, CHU Nice, France)

Visual Synthetic Data Generation for Sign Language Recognition (25')
Authors:  Ahmed F. Ibrahim(1), Rasha F. Kashef(2), Fakhri Karray(1), Mohamed Kamel(1) ((1) Pattern Analysis and Machine Intelligence Research Group, Electrical and Computer Engineering Department, University of Waterloo, Waterloo, Canada, (2) College on computing and information technology, Arab Academy for Science and Technology, Cairo, Egypt)

15:40 - 17:15 : Demos available on request from the PULSAR team (Borel Building).

Program committee

Jenny Benois-Pineau, University Bordeaux 1, LaBRI
Ni Bingbing, ADSC Singapore
Vittorio Murino, University of Verona, Italy
Shuicheng Yan, NUS Singapore
Nam Trung Pham, Institute for Infocomm Research, Singapore
Wang Yue, Institute for Infocomm Research, Singapore
Cyril Carincotte, Multitel, Belgium
Paolo Remagnino, Kingston University, United Kingdom
Alain Boucher, IFI-AUF, Vietnam
Marcos Zúñiga Barraza, Departamento de Electrónica - UTFSM, Chile


Francois Bremond, INRIA Sophia Antipolis, France
Jose Luis Patino Vilchis, INRIA Sophia Antipolis, France
Richard P. Chang, Institute for Infocomm Research, Singapore
Karianto Leman, Institute for Infocomm Research, Singapore
Jean-Marc Odobez, IDIAP, Switzerland

