Rationale
Video understanding is the real-time process of perceiving, analyzing and
building a semantic description of a 3D dynamic scene observed
through a network of cameras and possibly other sensors. This process
consists mainly in analyzing the signal information provided by the sensors
observing the scene, using a wide variety of models: either those that humans
usually use to understand a scene, or models defined for the purpose.
Computer vision and pattern recognition are the main technologies
used for automatic monitoring of public spaces over extended durations.
Effective approaches for tracking people and for recognizing poses, postures,
gestures, or collective crowd phenomena in public environments have
been developed in the last 5 years, especially in the video
surveillance context, where the aim is to classify behaviours as suspect,
unusual or abnormal. However, the core problem of understanding remains
complex, and current approaches need to improve before they can address
real-world situations.
The main challenge consists in generating qualitative and
semantic descriptions of people or object motion, up to the detailed
description of body part configuration, even in complex scenes. These
goals have become a key task in many computer vision applications, such
as image and scene understanding, health care, video indexing and
retrieval, video surveillance, and advanced human-computer interaction.
The key questions to be answered are:
- How far (i.e. more precise, longer activities) can we go with today's technologies when analysing people's behaviour?
- How can we fill the gap between video signal and semantic activities?
Topics
Behaviour2011 will aim at promoting interaction and collaboration among
researchers specialising in fields that include, but are by no means limited to:
- People detection and tracking;
- Video activity discovery;
- Group of people, crowd analysis;
- Multi-camera and multimodal analysis;
- High-level behaviour recognition and understanding;
- Long term event recognition;
- Use of ontologies on human motion for video footage;
- Browsing, indexing and retrieval of human behaviours in video sequences;
- Natural-language description of human behaviours;
- Cognitive surveillance and ambient intelligence;
- Learning models for behaviour analysis;
- Human behaviour synthesis: articulated models and animation;
- Real-time systems, system evaluation;
- Abnormal event detection.
Important dates
Submission of full papers: 8th of July 2011 (new date; previously 27th of June 2011)
Notification of acceptance: 18th of July 2011
Camera ready: 15th of August 2011
Workshop 2011: Friday, September 23, 2011, Sophia Antipolis (France)
Workshop program
All the sessions will be held in the Jacques Morgenstern amphitheatre, Gilles Kahn Building.
Friday, September 23, 2011
09:00 - 09:10 Welcome, Francois Brémond
09:10 - 09:50 Invited Talk: New trends in supporting ageing people with mild dementia in
their own living space, Mounir Mokhtari: CNRS IPAL (UMI 2955)
Singapore / Institute for Infocomm Research (I2R/A-Star) / Institut
TELECOM France.
Abstract:
Motivated by the growth of the ageing population worldwide and the need to
concentrate research efforts on a specific target group, our research
focuses on ageing persons with physical and cognitive deficiencies. The
primary goal is to enable the person with mild dementia, through
assistive technologies, to maximize his physical and mental function,
and to continue to engage in social networks, so that he can continue
to lead an independent and purposeful life.
The person with mild dementia usually
has few problems in the actual performance of tasks in most basic
activities of daily living but is often handicapped by his poor memory
and thus forgets to carry out these tasks. These can include bathing,
changing clothes and taking medication on time. Assistive technology
that provides timely prompts and reminders will enable him to preserve
his abilities and independence.
Finding an appropriate interface to
interact with persons with dementia can be a challenge. Indeed, the
interaction should take place through a user-friendly interface that makes
it easier for the person to access the environment and to benefit from
assistance. Introducing a desktop computer with a keyboard is not
always well accepted. Thus, we have adopted the approach of providing
assistive service through a multimodal interactive system including TV,
iPad-like tablets and wireless speakers. Consequently, the reasoning
level of the system may be calibrated according to the contextual
situation of the user (context awareness) to bring about the service
required. The User Interface should support not only the user’s
preferences but also his profile as dictated by his cognitive
abilities. At the same time, different surrounding computing platforms
and devices/sensors are considered additional sources of information.
In this paper we will focus mainly on the interaction level with the
system as well as on the validation stages performed to meet users’
requirements in terms of user interface design and content of service.
This is the result of more than 3 years' work, from 2006 to 2010, within
two European projects (IST-FP6 Cogknow project and ITEA NUADU project)
and an ongoing project in Singapore (AMUPADH).
09:50 - 10:50 Session 1:
- Group interaction and group tracking for video-surveillance in underground railway stations (25')
  Authors: Sofia Zaidenberg, Bernard Boulay, Carolina Garate, Duc-Phu Chau, Etienne Corvée and Francois Brémond (INRIA Sophia-Antipolis, France)
- Haar like and LBP based features for face, head and people detection in video sequences (25')
  Authors: Etienne Corvée, Francois Brémond (INRIA Sophia-Antipolis, France)
11:10 - 12:40 Session 2:
- Towards a Multi-purpose Monocular Vision-based High-Level Situation Awareness System (25')
  Authors: David Münch, Kai Jüngling, Michael Arens (Fraunhofer IOSB, Ettlingen, Germany)
- Abnormal behavior detection in video protection systems (25')
  Authors: Luis Patino(1), Hamid Benhadda(2), Nedra Nefzi(1-2), Bernard Boulay(1), Francois Brémond(1), Monique Thonnat(1) ((1) INRIA Sophia-Antipolis, France, (2) THALES, France)
- An unsupervised learning method for human activity recognition based on a temporal qualitative model (25')
  Authors: Franck Vandewiele and Cina Motamed (Laboratoire LISIC, Université du Littoral Côte d'Opale, Calais, France)
14:00 - 14:40 Invited Talk: Human Detection and Data Association in Multiple Camera
Tracking, Richard Chang, Institute for Infocomm Research (I2R/A-Star)
Abstract:
Tracking multiple objects under merging, splitting, and occlusion situations is
a challenging task for a surveillance system, especially when objects
are close together. This is due to the ambiguity and uncertainty of visual
features from these objects as compared to a situation where tracked
objects are isolated. To handle these challenges in multiple object
tracking, techniques that apply observations from detected objects have
been introduced. Joint probabilistic data association (JPDA) is one of
them: it calculates the probability of all possible data associations
between objects and observations. The advantages of JPDA and the particle
filter can then be combined to obtain better tracking results from
a human detection algorithm. The detection stage can be performed using
Local Binary Pattern (LBP). LBP is a texture descriptor that combines
local primitives into a feature histogram. LBP and its extensions
outperform existing texture descriptors both with respect to
performance and to computational efficiency. It is therefore suitable for
foreground detection or background modelling.
Most methods for multiple camera
tracking also rely on accurate calibration to associate data from
multiple cameras. However, it is often not easy to have an accurate
calibration in some real applications due to practical reasons. The
inaccurate calibration can then lead to wrong data association of
objects between cameras. To handle the data association of objects in
multiple cameras under inaccurate ground plane homography, the RFS
Bayes filter can be applied. Observation measurements from the cameras can be
modelled as a random finite set. This random finite set includes
the primary measurement from the object, extraneous measurements of the
object, and clutter. Experimental results including challenging cases
such as occlusions and merging persons will be presented.
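As an illustration of the LBP descriptor mentioned in the abstract, the following is a minimal sketch of the basic 3x3 variant, in which each pixel is encoded by comparing its eight neighbours with the centre value and the codes are accumulated into a 256-bin histogram. It is not the speaker's implementation and does not cover the LBP extensions; the function names `lbp_code` and `lbp_histogram`, the neighbour ordering and the list-of-lists image format are assumptions made for this example.

```python
# Basic 3x3 Local Binary Pattern (illustrative sketch only).
# Assumption: the image is a list of rows of grey-level integers.

# Neighbour offsets (row, col), clockwise from the top-left pixel.
# The ordering is an arbitrary choice; it only fixes which bit each neighbour sets.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre contributes a 1 bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Accumulate the LBP codes of all interior pixels into 256 bins."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

Border pixels are skipped here because they lack a full 3x3 neighbourhood; a practical system would typically pad the image or use an existing library routine instead.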
14:40 - 15:40 Session 3:
- Video Activity Recognition Framework for assessing motor behavioural disorders in Alzheimer Disease Patients (25')
  Authors: Veronique Joumier(1), Rim Romdhane(1), Francois Brémond(1), Monique Thonnat(1), Emmanuel Mulin(2), Philippe Henri Robert(2) ((1) Inria Sophia Antipolis, France, (2) Centre Mémoire de Ressources et de Recherche, CHU Nice, France)
- Visual Synthetic Data Generation for Sign Language Recognition (25')
  Authors: Ahmed F. Ibrahim(1), Rasha F. Kashef(2), Fakhri Karray(1), Mohamed Kamel(1) ((1) Pattern Analysis and Machine Intelligence Research Group, Electrical and Computer Engineering Department, University of Waterloo, Waterloo, Canada, (2) College of Computing and Information Technology, Arab Academy for Science and Technology, Cairo, Egypt)
15:40 - 17:15 Demos available on request from the PULSAR team (Borel Building).
Program committee
Jenny Benois-Pineau, University Bordeaux 1, LaBRI
Ni Bingbing, ADSC Singapore
Vittorio Murino, University of Verona, Italy
Shuicheng Yan, NUS Singapore
Nam Trung Pham, Institute for Infocomm Research, Singapore
Wang Yue, Institute for Infocomm Research, Singapore
Cyril Carincotte, Multitel, Belgium
Paolo Remagnino, Kingston University, United Kingdom
Alain Boucher, IFI-AUF, Vietnam
Marcos Zúñiga Barraza, Departamento de Electrónica - UTFSM, Chile
Organizers
Francois Brémond, INRIA Sophia Antipolis, France
Jose Luis Patino Vilchis, INRIA Sophia Antipolis, France
Richard P. Chang, Institute for Infocomm Research, Singapore
Karianto Leman, Institute for Infocomm Research, Singapore
Jean-Marc Odobez, IDIAP, Switzerland
Call for papers (pdf version)