Interpretation
The objective of this state model is to provide a set of generic states based on a formalism which enables both extension and parametrization. A state of the scene is defined by an n-ary tree which represents the way this state is computed. Four types of nodes are distinguished: object nodes, descriptor nodes, operator nodes and classifier nodes (see below for their definitions). The root node is a classifier node. The leaves of this tree are object nodes. The parent nodes of the leaves are descriptor nodes. All other intermediate nodes are operator nodes.
The minimal tree structure is reduced to three nodes: one classifier root node, one descriptor intermediate node and one object terminal node. The number of branches of the tree and the length of the branches are free.
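The tree model described above can be sketched in code. This is a minimal illustration, assuming a simple recursive evaluation; all class names, the height threshold and the example posture classifier are hypothetical, not taken from the paper.

```python
class ObjectNode:
    """Leaf node: wraps a detected scene object (person, area, equipment)."""
    def __init__(self, obj):
        self.obj = obj
    def evaluate(self):
        return self.obj

class DescriptorNode:
    """Parent of a leaf: extracts a feature (position, size, ...) from it."""
    def __init__(self, extract, child):
        self.extract, self.child = extract, child
    def evaluate(self):
        return self.extract(self.child.evaluate())

class OperatorNode:
    """Intermediate node: combines the values of its children."""
    def __init__(self, op, children):
        self.op, self.children = op, children
    def evaluate(self):
        return self.op(*[c.evaluate() for c in self.children])

class ClassifierNode:
    """Root node: maps the computed value to a symbolic state value."""
    def __init__(self, classify, child):
        self.classify, self.child = classify, child
    def evaluate(self):
        return self.classify(self.child.evaluate())

# Minimal 3-node tree: classifier -> descriptor -> object.
person = {"h": 95.0}                      # detected person, height in cm
tree = ClassifierNode(
    lambda h: "standing" if h > 120 else "crouching",  # hypothetical threshold
    DescriptorNode(lambda o: o["h"], ObjectNode(person)),
)
print(tree.evaluate())                    # -> crouching
```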
Given this model, for each image the set of generic predefined states is instantiated with the objects currently detected in the scene. The resulting set of instantiated states provides a description of the scene at that time. Event recognition is performed by comparing this new set with those obtained at the preceding instants: the states whose symbolic value has changed create new events.
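The comparison step can be sketched as a diff between two dictionaries of instantiated states, keyed by state name and object identifiers. The keys and values below are illustrative assumptions, not the paper's actual data structures.

```python
def detect_events(previous, current):
    """previous/current map a state key, e.g. ('posture', 'person_1'),
    to its symbolic value; emit one event per changed value."""
    events = []
    for key, value in current.items():
        if key in previous and previous[key] != value:
            events.append((key, previous[key], value))
    return events

prev = {("posture", "person_1"): "standing", ("velocity", "person_1"): "walks"}
curr = {("posture", "person_1"): "crouching", ("velocity", "person_1"): "walks"}
print(detect_events(prev, curr))
# -> [(('posture', 'person_1'), 'standing', 'crouching')]
```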
The three classes of objects are person, area, and equipment.
The persons are the mobile objects of the scene which have been recognized as human. The previous steps provide a vector representing the location of the person on the ground, a vector representing the speed of the person, and the size h of that person. An area is a static object representing a subpart of the ground of the scene, described by a polygon. An equipment represents any volumic object of the environment for which we know the polygonal base and the height h.
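The three object classes can be summarized as plain records. This is a hedged sketch: the field names and the 2D-tuple representation of vectors and polygon vertices are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec2 = Tuple[float, float]

@dataclass
class Person:            # mobile object recognized as human
    position: Vec2       # location on the ground
    speed: Vec2          # speed vector
    h: float             # size (height) of the person

@dataclass
class Area:              # static subpart of the ground of the scene
    polygon: List[Vec2]

@dataclass
class Equipment:         # volumic object with known base and height
    polygon: List[Vec2]
    h: float

p = Person(position=(1.0, 2.0), speed=(0.5, 0.0), h=175.0)
print(p.h)               # -> 175.0
```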
We have defined four nodes of the descriptor type: position, size, speed and shape. Position, applied to an object of the class <<person>>, gives access to the location of the person. Size, applied to an object of the class <<person>> or of the class <<equipment>>, recovers the size h of the object. Speed, applied to an object of the class <<person>>, returns the speed vector. Shape, applied to an object of the class <<equipment>> or <<area>>, returns the polygon associated with this object.
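The descriptors amount to small accessor functions whose applicability follows the class rules above. A minimal sketch, assuming objects are plain dictionaries; the field names are hypothetical.

```python
def position(obj):
    return obj["position"]        # applies to <<person>>

def size(obj):
    return obj["h"]               # applies to <<person>> and <<equipment>>

def speed(obj):
    return obj["speed"]           # applies to <<person>>

def shape(obj):
    return obj["polygon"]         # applies to <<equipment>> and <<area>>

person = {"position": (1.0, 2.0), "speed": (0.5, 0.0), "h": 175.0}
area = {"polygon": [(0, 0), (4, 0), (4, 3), (0, 3)]}
print(size(person))               # -> 175.0
print(shape(area)[0])             # -> (0, 0)
```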
We have defined four nodes of the operator type: distance, norm, angle and constr. Distance is a binary operator computing the Euclidean distance. Norm is an operator computing the norm of a vector. Angle is an operator computing the angle between two vectors in degrees. Constr is an operator which constructs a 2D vector from its scalar components.
We have defined eight nodes of the classifier type. Based on these classifiers, operators, descriptors and objects we have defined eight states: posture, direction, velocity, location, proximity, relative location, relative posture and relative walk.
For instance, we have defined the state relative walk(o_person,i, o_person,j) (see figure 6) by measuring the angle between the speed vectors of o_person,i and o_person,j, and the distance between o_person,i and o_person,j. If the speed vectors have a similar orientation (an angle below 45 degrees or greater than 315 degrees) and if the distance is small (below 200 cm), then these persons are considered as having a coupled relative walk.
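The relative-walk test can be sketched as follows. The thresholds (45/315 degrees, 200 cm) are those of the text; the use of a signed angle in [0, 360) via atan2, and the tuple-based person representation, are assumptions for illustration.

```python
import math

def signed_angle_deg(u, v):
    """Angle from vector u to vector v, in [0, 360) degrees."""
    a = math.degrees(math.atan2(v[1], v[0]) - math.atan2(u[1], u[0]))
    return a % 360.0

def relative_walk(pos_i, speed_i, pos_j, speed_j):
    a = signed_angle_deg(speed_i, speed_j)
    d = math.hypot(pos_i[0] - pos_j[0], pos_i[1] - pos_j[1])
    similar_orientation = a < 45.0 or a > 315.0
    small_distance = d < 200.0            # distance in cm
    return "coupled" if similar_orientation and small_distance else "uncoupled"

# Two persons 100 cm apart, walking in nearly the same direction:
print(relative_walk((0, 0), (1.0, 0.1), (100, 0), (1.0, -0.1)))  # -> coupled
```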
This enables us to define 18 events.
Posture(o_person,i) changes create the events o_person,i falls down, o_person,i crouches down and o_person,i stands up.
Direction(o_person,i) changes create the events o_person,i goes right side, o_person,i goes left side, o_person,i goes away and o_person,i arrives.
Velocity(o_person,i) changes create the events o_person,i stops, o_person,i walks and o_person,i starts running.
Location(o_person,i, o_area,j) changes create the events o_person,i leaves o_area,j and o_person,i enters o_area,j.
Proximity(o_person,i, o_equipment,j) changes create the events o_person,i moves close to o_equipment,j and o_person,i moves away from o_equipment,j.
Relative location(o_person,i, o_person,j) changes create the events o_person,i moves close to o_person,j and o_person,i moves away from o_person,j.
Relative posture(o_person,i, o_equipment,j) changes create the event o_person,i sits on o_equipment,j.
Relative walk(o_person,i, o_person,j) changes create the event o_person,i and o_person,j walk together.
Recognizing a scenario implies recognizing all the events which compose it and verifying the constraints on their dependencies. The constraints can be temporal, spatial, logical or algebraic. A scenario can be:
The principle of scenario recognition consists of two points: as previously described, we generate, image after image, the interesting events which happened in the scene; then, with those events, we instantiate predefined scenario models. Scenario recognition thus corresponds to updating a set of partially recognized scenarios.
We will now give details of the scenario model we use. A scenario s_i,t, where i is the scenario identifier and t the current time of recognition, is composed of four parts: events, constraints, conditions, and success.
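The update of partially recognized scenarios can be sketched as follows. This is a deliberately simplified illustration: the four-part model is collapsed into an ordered event list plus a single constraint check, and the scenario ("enters then sits on within 10 s"), the event tuples and all names are hypothetical.

```python
class PartialScenario:
    def __init__(self, model_events, constraint):
        self.remaining = list(model_events)   # events still to be recognized
        self.seen = []                        # events matched so far
        self.constraint = constraint          # e.g. a temporal constraint

    def update(self, event):
        """Consume one event; return True once the scenario is recognized."""
        if self.remaining and event[0] == self.remaining[0] \
                and self.constraint(self.seen, event):
            self.seen.append(event)
            self.remaining.pop(0)
        return not self.remaining

# Hypothetical scenario: person enters an area then sits on an
# equipment within 10 seconds. Events are (name, timestamp) pairs.
within_10s = lambda seen, ev: not seen or ev[1] - seen[-1][1] <= 10.0
s = PartialScenario(["enters", "sits_on"], within_10s)
print(s.update(("enters", 2.0)))      # -> False (partially recognized)
print(s.update(("sits_on", 7.5)))     # -> True  (scenario recognized)
```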