Object recognition is a central problem in computer vision. Typically
it is assumed to follow a sequential model in which successively more
specific hypotheses are generated about the image. This model is rather
simplistic, since it leaves no margin for error at any stage. We
follow a more general approach in which the various representations
involved are allowed to influence one another from the outset. As a guide
and ultimate goal, we study the problem of finding the region occupied by
human beings in images and separating that region into arms, legs and
head. We approach the problem as that of defining a functional on the
space of boundaries in images whose minimum specifies the region occupied
by the human figure.
Previous work that uses such functionals suffers from a number of
difficulties. These include an uncontrollable dependence on scale, an
inability to find the global minimum over boundaries in polynomial time,
and an inability to include region as well as boundary information. We
present a new form of functional on boundaries in a manifold that solves
these problems, and is also the unique form of functional in a specific
class that possesses a non-trivial, efficiently computable global minimum.
We describe applications of the model to single images and to the
extraction of boundaries from stereo pairs and motion sequences.
In addition, the functionals used in previous work could not include
information about the shape of the region sought. We develop a model for
the part structures of boundaries that extends previous work to the case
of real images, thus including shape information in the functional
framework. We show that such part structures are hyperpaths in a
hypergraph. An 'optimal hyperpath' algorithm is developed that globally
minimizes the functional under some conditions.
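
To give a feel for what an optimal-hyperpath computation can look like, the following is a minimal sketch and not the algorithm developed in this work. It assumes a directed hypergraph in which each hyperarc has a non-empty set of tail nodes, a single head node, and a non-negative weight, and it assumes an additive cost recursion (the cost of a node is the weight of the hyperarc used to reach it plus the costs of that hyperarc's tails). The function name `min_cost_hyperpaths` and the data layout are hypothetical. Under these assumptions a Dijkstra-style sweep finds the global minimum.

```python
import heapq
from math import inf

def min_cost_hyperpaths(sources, hyperarcs):
    """Dijkstra-style sweep over a directed hypergraph (illustrative sketch).

    hyperarcs: list of (tails, head, weight); tails is a non-empty
    collection of nodes and weight >= 0 (both are assumptions).
    Returns (cost, pred): cost[v] is the minimum of
    weight + sum(cost[t] for t in tails) over hyperarcs entering v,
    and pred[v] is the index of the hyperarc achieving it.
    """
    # Number of tail nodes of each hyperarc whose cost is not yet final.
    remaining = [len(set(tails)) for tails, _, _ in hyperarcs]
    # Index: node -> hyperarcs that have this node among their tails.
    by_tail = {}
    for i, (tails, _, _) in enumerate(hyperarcs):
        for t in set(tails):
            by_tail.setdefault(t, []).append(i)

    tentative = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in sources]
    heapq.heapify(heap)
    cost, pred = {}, {}

    while heap:
        c, v = heapq.heappop(heap)
        if v in cost:
            continue  # stale heap entry; v was already finalized
        cost[v] = c
        for i in by_tail.get(v, []):
            remaining[i] -= 1
            if remaining[i] > 0:
                continue  # hyperarc usable only once all tails are final
            tails, head, weight = hyperarcs[i]
            candidate = weight + sum(cost[t] for t in set(tails))
            if head not in cost and candidate < tentative.get(head, inf):
                tentative[head] = candidate
                pred[head] = i
                heapq.heappush(heap, (candidate, head))
    return cost, pred

# Toy hypergraph: nodes are strings, hyperarcs are (tails, head, weight).
sources = ["a", "b"]
hyperarcs = [
    (("a", "b"), "c", 1.0),
    (("a",), "d", 5.0),
    (("c",), "d", 2.0),
    (("c", "d"), "e", 1.0),
]
cost, pred = min_cost_hyperpaths(sources, hyperarcs)
```

Following `pred` back from a target node recovers the hyperarcs of its hyperpath. The sweep is correct here only because weights are non-negative and the cost of a head is never smaller than the cost of any of its tails; a functional without that property would need a different method.
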
We show how to use exemplars of a shape to construct a functional that
includes specific information about the topology of the part structure
sought. An algorithm is developed that globally minimizes such functionals
in the case of a fixed boundary. The behaviour of the functional mimics an
aspect of human shape comparison.