Object tracking and scenario recognition for video-surveillance

Authors: François Brémond and Monique Thonnat


In this paper we address moving region tracking and scenario recognition for scene interpretation systems. The class of applications we are interested in is the automatic surveillance of real-world scenes with a fixed monocular color camera. Given image sequences of a scene, the interpretation system has to recognize scenarios relating to the behaviors of humans or vehicles \cite{bre97b}. In this paper we focus on the connections between the tracking module and the scenario recognition module.

\section{Scenario recognition}

We have developed an interpretation system composed of three modules. The {\it detection module} detects the moving regions. The {\it tracking module} then tracks the detected regions. The {\it scenario recognition module} then generates hypotheses that group the tracked moving regions into mobile objects composed of one or more regions. Finally, the scenario recognition module computes mobile object properties and analyzes the scenarios relating to the behavior of mobile objects, based on their properties (e.g. height) or the evolution of their properties (e.g. increase). So far we mainly use eight mobile object properties. These properties are computed over a short time interval to compensate for errors due to bad detection conditions. The scenarios are defined recursively from scenario elements, so they can describe activities over a long time interval. The main characteristic of this representation is that it is generic and flexible enough to describe human activities easily.

There are two types of scenarios: {\it non-temporal} and {\it temporal}. First, a scenario can represent a non-temporal constraint on a set of sub-scenarios and properties. Second, a scenario can represent a temporal sequence of sub-scenarios. If the scenario represents a non-temporal constraint, then a scenario recognition value quantifies how well the constraint is verified. A likelihood degree is computed through a diagnosis stage.
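
The recursive, non-temporal part of this representation could be sketched as follows. This is only an illustration under stated assumptions, not the system's actual implementation: the class {\tt NonTemporalScenario}, the scoring helper, the property names and thresholds are all hypothetical, and combining scores with a minimum (a fuzzy conjunction) is an assumed rule that the text does not specify. The diagnosis stage that produces the likelihood degree is not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Mobile object properties at one instant, e.g. {"height": 1.8, "speed": 0.25}.
Properties = Dict[str, float]
Scorer = Callable[[Properties], float]


def at_least(key: str, threshold: float) -> Scorer:
    """Hypothetical scorer: 1.0 once props[key] reaches a positive threshold,
    otherwise the ratio to the threshold, clamped to [0, 1]."""
    def score(props: Properties) -> float:
        return max(0.0, min(1.0, props.get(key, 0.0) / threshold))
    return score


@dataclass
class NonTemporalScenario:
    """A constraint over properties and sub-scenarios (hypothetical structure)."""
    name: str
    constraints: List[Scorer] = field(default_factory=list)
    sub_scenarios: List["NonTemporalScenario"] = field(default_factory=list)

    def recognition_value(self, props: Properties) -> float:
        # Assumption: the constraint is verified only as well as its weakest
        # part, so scores are combined with min (fuzzy conjunction).
        scores = [c(props) for c in self.constraints]
        scores += [s.recognition_value(props) for s in self.sub_scenarios]
        return min(scores) if scores else 0.0


# A scenario defined recursively from two simpler scenario elements.
is_tall = NonTemporalScenario("is_tall", constraints=[at_least("height", 1.5)])
is_moving = NonTemporalScenario("is_moving", constraints=[at_least("speed", 0.5)])
walking_person = NonTemporalScenario(
    "walking_person", sub_scenarios=[is_tall, is_moving]
)

value = walking_person.recognition_value({"height": 1.8, "speed": 0.25})
print(value)
```

Because a scenario only refers to other scenarios and to property scorers, longer activities can be built by nesting without changing the evaluation code.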
If the scenario represents a temporal sequence, then it is recognized through an automaton whose states represent the sub-scenarios. A scenario recognition value quantifies the current state of recognition. A likelihood degree and the automaton transitions are computed from the likelihood degrees of its sub-scenarios. The scenario is recognized when the recognition values and likelihood degrees of all its sub-scenarios are high enough.

In cluttered scenes, moving regions are often partially detected, lost, or mixed with other moving regions, so in many cases scenarios cannot be recognized because of tracking failures. To improve the tracking process we propose two mechanisms that use the results of scenario recognition. The first mechanism uses scenario recognition as additional information to resolve ambiguous correspondences: it validates the ambiguous moving region whose scenario has the highest likelihood degree. The second uses a scenario in which a mobile object behaves like noise; this scenario is based on the evolution of the mobile object's size, speed and trajectory. Thus a reliable scenario recognition enhances the tracking process, and a robust tracking process is needed to obtain a reliable scenario recognition. To break this deadlock we analyze scenarios as soon as the tracking of moving regions begins, even with inaccurate data. The likelihood degree of the scenarios then indicates when their results can be used.

\section{Conclusion}

We have tested our system on car park and metro video-surveillance applications. The results show the benefits of making the tracking process cooperate with scenario recognition. They also show that scenarios can continue to be recognized in some situations even with an inaccurate tracking process. Another way to obtain reliable results is to define scenarios using contextual information, as described in \cite{bre97b}.
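
As an illustration of the automaton-based recognition of temporal scenarios described in the Scenario recognition section, the following sketch advances through an ordered list of sub-scenarios. It is a minimal example under stated assumptions rather than the authors' implementation: the class {\tt TemporalScenario}, the 0.7 threshold, the use of the fraction of completed sub-scenarios as the recognition value, and the speed-based sub-scenarios are all hypothetical.

```python
from typing import Callable, Dict, List

Properties = Dict[str, float]
Scorer = Callable[[Properties], float]


class TemporalScenario:
    """Automaton whose states correspond to an ordered list of sub-scenarios."""

    def __init__(self, name: str, steps: List[Scorer], threshold: float = 0.7):
        if not steps:
            raise ValueError("a temporal scenario needs at least one sub-scenario")
        self.name = name
        self.steps = steps          # one likelihood scorer per sub-scenario
        self.threshold = threshold  # assumed cut-off for taking a transition
        self.state = 0              # index of the sub-scenario awaited next

    def update(self, props: Properties) -> float:
        """Feed one observation; move to the next state if the awaited
        sub-scenario is likely enough, then return the recognition value."""
        if not self.recognized and self.steps[self.state](props) >= self.threshold:
            self.state += 1
        return self.recognition_value()

    def recognition_value(self) -> float:
        # Assumption: progress through the sequence, in [0, 1].
        return self.state / len(self.steps)

    @property
    def recognized(self) -> bool:
        return self.state == len(self.steps)


# Hypothetical scenario: a mobile object moves fast, then stops.
moves_fast: Scorer = lambda p: 1.0 if p["speed"] > 1.0 else 0.0
stops: Scorer = lambda p: 1.0 if p["speed"] < 0.1 else 0.0
scenario = TemporalScenario("moves_then_stops", [moves_fast, stops])

observations = [{"speed": 1.5}, {"speed": 0.8}, {"speed": 0.05}]
values = [scenario.update(obs) for obs in observations]
print(values, scenario.recognized)
```

Evaluating the automaton at every observation, even early noisy ones, matches the idea of analyzing scenarios as soon as tracking begins and only trusting the result once its value is high enough.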

Keywords: vision, knowledge representation, diagnosis, application

BibTeX reference:

        AUTHOR             = {F. Brémond and M. Thonnat},
        BOOKTITLE          = {Proc. of the {I}nternational {J}oint
		  {C}onference on {A}rtificial {I}ntelligence (IJCAI'97)},
        MONTH              = aug,
        TITLE              = {Object tracking and scenario recognition for video-surveillance},
        YEAR               = {1997}

 Last updated: 15/03/01