For continuously operating Active Vision systems that interact with their environment (such as an industrial robot on a mobile platform), it is highly desirable to achieve a "closed loop" of information flow. This may not be immediately achievable as a real-time application on a physical mobile platform, but it may be achievable in a testbed that simulates the real-world components (environment, cameras, movement and so on) sufficiently well.
We would like to re-emphasize this investigation into mechanisms and control structures for restrictive processing in continuous-operation vision systems. The motivation is the computational demand of performing robot vision in real time. The investigation includes mechanisms for goal-directed perceptual strategies such as focus of attention, fovea, and spatio-temporal scale-space, with variable space and time resolution sampling to provide sufficiently good representations of control mechanisms. The real-time restriction is interesting and important for a number of reasons:
In order to be acceptable, this real-time system is:
Therefore, such a system consists of a small kernel to which problem-oriented code is added as modules.
However, it is crucial that the kernel is properly decomposed and layered, with well-defined interfaces to every part of the kernel, so that it will be possible to choose alternative implementations of one or several services in the kernel. One reason for choosing an alternative implementation is increased efficiency, another is integration and compatibility with other systems.
For the same reason, the kernel must not contain anything that could just as well be implemented in add-on libraries. For example, support for the EUI data structures should be kept out of the kernel. For distributed systems, the naming of processes, objects and services could be implemented by a general, replaceable name service in user space. Scheduling should be implemented by a user-defined scheduler and must not be hard-wired into the kernel. Support for the transportation and distribution of objects should be based on protocols that allow for multiple programming languages, transport layers, and object persistence implementations.
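The separation described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the design of any particular system: the names `Kernel`, `register_service`, `FifoScheduler` and `PriorityScheduler` are hypothetical, and a real kernel would of course handle preemption, timing and errors. The point shown is that the kernel holds no scheduling policy of its own and simply delegates to whichever scheduler has been registered.

```python
import heapq
from collections import deque


class FifoScheduler:
    """Hypothetical scheduler: runs tasks in arrival order."""

    def __init__(self):
        self._queue = deque()

    def submit(self, task, priority=0):
        # The priority argument is accepted but ignored by this policy.
        self._queue.append(task)

    def next_task(self):
        return self._queue.popleft() if self._queue else None


class PriorityScheduler:
    """Hypothetical scheduler: the lowest priority value runs first."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so tasks themselves are never compared

    def submit(self, task, priority=0):
        heapq.heappush(self._heap, (priority, self._counter, task))
        self._counter += 1

    def next_task(self):
        return heapq.heappop(self._heap)[2] if self._heap else None


class Kernel:
    """Hypothetical minimal kernel: it only keeps a table of named services."""

    def __init__(self):
        self._services = {}

    def register_service(self, name, implementation):
        # Registering under an existing name replaces the old implementation.
        self._services[name] = implementation

    def service(self, name):
        return self._services[name]

    def run(self):
        # The kernel asks whichever scheduler is registered; no policy is hard-wired.
        scheduler = self.service("scheduler")
        results = []
        while (task := scheduler.next_task()) is not None:
            results.append(task())
        return results


kernel = Kernel()
kernel.register_service("scheduler", PriorityScheduler())
kernel.service("scheduler").submit(lambda: "plan path", priority=5)
kernel.service("scheduler").submit(lambda: "servo loop", priority=0)
print(kernel.run())  # ['servo loop', 'plan path']
```

Registering `FifoScheduler()` under the same name would change the execution order without touching the `Kernel` class, which is the kind of replaceability argued for above.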
Some systems invent their own specification languages, scripting languages and graphical interfaces when there are existing, generally accepted alternatives. The use of mainstream components increases the probability that the system will be accepted and used.
Many existing distributed image processing systems (AVS, Khoros etc.) are based on computational networks consisting of filters that operate on data streams. While this abstraction might be appropriate for some image processing problems, reactive real-time systems need more control over the computations. There is a strong trend towards distributed object-oriented systems, in which the programmer gets the feeling that all objects exist in a single process, but where some method invocations are actually implemented by transparent remote procedure calls, shared buffer access, etc. (CORBA, DOE, Spring and others).
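The idea of a method invocation that is transparently turned into a remote call can be illustrated with a small stub object. The sketch below is not CORBA, DOE or Spring, and uses none of their APIs; `ImageStore`, `LoopbackTransport` and `RemoteProxy` are hypothetical names, and the "transport" merely passes serialized bytes within one process as a stand-in for a real network link or shared buffer.

```python
import json


class ImageStore:
    """Hypothetical server-side object that would live in another process."""

    def __init__(self):
        self._sizes = {"frame0": [512, 512]}

    def size_of(self, name):
        return self._sizes[name]


class LoopbackTransport:
    """Stand-in for a real transport: it dispatches in-process, but only
    serialized bytes cross the boundary, as they would on a network link."""

    def __init__(self, target):
        self._target = target

    def call(self, request_bytes):
        request = json.loads(request_bytes.decode("utf-8"))
        method = getattr(self._target, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class RemoteProxy:
    """Client-side stub: any method call is marshalled and sent via the transport."""

    def __init__(self, transport):
        self._transport = transport

    def __getattr__(self, method_name):
        # Called only for attributes not found normally, i.e. the remote methods.
        def invoke(*args):
            request = json.dumps({"method": method_name, "args": list(args)})
            reply = self._transport.call(request.encode("utf-8"))
            return json.loads(reply.decode("utf-8"))["result"]

        return invoke


store = RemoteProxy(LoopbackTransport(ImageStore()))
print(store.size_of("frame0"))  # [512, 512]
```

The caller writes `store.size_of("frame0")` exactly as for a local object; only the proxy and transport know that the call was marshalled, which is the programming model the trend above aims for.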
This is also important for creating embedded applications, where one should not need to "download" more than what is actually used. Moreover, it could enable a "progressive learning scheme", so that a newcomer does not need to learn more than is strictly needed to get a first application started.
The real-time restriction is also interesting and important for a number of reasons related to industrial applications:
For building real-time systems there are several commercially available operating systems, and tools for the specification of such systems have recently also become available in the academic community. It is, however, characteristic that the real-time tools available today all assume that real-time systems are homogeneous from a complexity point of view, and that typically only a few selected routines require real-time response. In fully fledged computer vision systems, the need for guaranteed response times varies from a few milliseconds (low-level control loops) to several seconds (symbolic interpretation and planning tasks). The response-time requirements are consequently highly heterogeneous. In addition, computer vision and image analysis are characterized by an excessive dataflow (6-20 MB/s), which in turn requires that the execution environment be able to create and destroy large data structures without any effect on the performance of the system. On top of this, facilities for control of processing, analysis and interpretation must be provided to ensure models of limited size.
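One common way to keep the creation and destruction of large data structures from disturbing timing is to preallocate buffers and recycle them. The text above does not prescribe this technique; the following is a minimal sketch of a buffer pool under assumed parameters (a hypothetical `FrameBufferPool` class and an assumed frame format of 512 x 512 pixels at 8 bits per pixel).

```python
class FrameBufferPool:
    """Hypothetical pool that preallocates fixed-size frame buffers."""

    def __init__(self, buffer_size, count):
        # All allocation happens once, before the time-critical loop starts.
        self.buffer_size = buffer_size
        self._free = [bytearray(buffer_size) for _ in range(count)]

    def acquire(self):
        # Hand out an existing buffer; fail fast rather than allocate in the loop.
        if not self._free:
            raise RuntimeError("pool exhausted")
        return self._free.pop()

    def release(self, buffer):
        # Return the buffer for reuse; its contents are simply overwritten later.
        self._free.append(buffer)

    def available(self):
        return len(self._free)


# Assumed frame format: 512 x 512 pixels, 8 bits per pixel.
pool = FrameBufferPool(buffer_size=512 * 512, count=4)
frame = pool.acquire()
frame[0] = 255  # write pixel data in place
pool.release(frame)
print(pool.available())  # 4
```

Acquiring and releasing a frame then costs a list operation rather than a large allocation, at the price of fixing the maximum number of frames in flight in advance.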