The general goal of the environment perception task was to build a modular virtual receptor utilizing data from RGB-D sensors. During the design of the receptor a set of key functional modules was identified. This set naturally divides into a subset of 3D analysis modules and a subset of 2D analysis modules. The 3D analysis modules are used for near-field analysis, while the 2D analysis modules are used for far-field analysis, where no reliable depth data is available. The identified 3D analysis modules are as follows:

- the 3D occupancy map module,
- the 3D segmentation module,
- the 3D shape recognition module,
- the ontology-based 3D object modelling module,
- the ontology-based reasoning module.

The identified 2D analysis modules are as follows:

- the 2D image segmentation module,
- the 2D shape classification module,
- the 2D texture classification module.

All the above modules were designed and implemented according to systems engineering principles. They are described in detail below.

3D analysis modules

3D occupancy map module

The 3D occupancy map module incrementally builds a map of the robot's neighborhood as the sensor moves or rotates. We adopted a surfel (surface element) representation of the occupancy map instead of the widely used point representation. This choice pays off at the later stage of joining surfels into surfaces of environment objects, because surfel normal vectors enter the criteria for surfel orientation closeness. We also chose and tested a solution to the loop closing problem, so our 3D occupancy map does not suffer from accumulated position drift.
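For illustration, a surfel and an orientation-closeness merge test could be sketched as follows; the `Surfel` type, field names, and thresholds are illustrative assumptions, not the module's actual interface:

```python
import math
from dataclasses import dataclass

@dataclass
class Surfel:
    position: tuple   # (x, y, z) centre of the surface element
    normal: tuple     # unit normal vector of the element
    radius: float     # spatial extent of the element

def can_merge(a: Surfel, b: Surfel,
              max_angle_deg: float = 15.0,
              max_gap: float = 0.05) -> bool:
    """Two surfels may be joined into one surface if their normals are
    close in orientation and their discs are close in space."""
    dot = sum(x * y for x, y in zip(a.normal, b.normal))
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot))))
    gap = math.dist(a.position, b.position) - (a.radius + b.radius)
    return angle <= max_angle_deg and gap <= max_gap
```

With this criterion, two nearby coplanar surfels merge, while a nearby surfel with a perpendicular normal (e.g. on a wall meeting the floor) is kept separate.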

3D segmentation module

The 3D segmentation module uses the RGB-D data to produce textured surface patches representing distinct parts of the environment. The segmentation algorithm is based on a point cloud clustering algorithm reinforced with normal vector continuity criteria. The algorithm is robust to momentary data loss and allows setting the weights of information components such as color, normal vector, and depth. Moreover, the algorithm not only detects surface patches, but also builds groups of patches based on a novel patch similarity criterion.
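A minimal sketch of clustering with a normal continuity criterion, assuming a brute-force neighbour search and illustrative thresholds (the module's actual algorithm and its weighting of color, normal, and depth are not reproduced here):

```python
import math

def grow_clusters(points, normals, dist_thresh=0.1, angle_thresh_deg=20.0):
    """Greedy region growing: a point joins a cluster if it lies close to a
    cluster member AND its normal does not deviate too much (continuity)."""
    cos_thresh = math.cos(math.radians(angle_thresh_deg))
    n = len(points)
    labels = [-1] * n
    cluster = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        labels[seed] = cluster
        frontier = [seed]
        while frontier:
            i = frontier.pop()
            for j in range(n):
                if labels[j] != -1:
                    continue
                close = math.dist(points[i], points[j]) <= dist_thresh
                dot = sum(a * b for a, b in zip(normals[i], normals[j]))
                if close and dot >= cos_thresh:
                    labels[j] = cluster
                    frontier.append(j)
        cluster += 1
    return labels
```

The normal continuity test is what separates, say, a floor point from an adjacent wall point even when the two are spatially close.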

3D shape recognition module

The 3D shape recognition module creates basic shapes, namely cones, cylinders, planes, parallelepipeds, and ellipsoids. We solved the problems arising at patch boundaries, and the resultant shapes are built from surface patches. The recognized shapes are fed into the 3D object modelling module, which can group them into real-world objects. Additionally, the module allows testing numerous spatial properties, such as touching, crossing, and convexity.
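As an example of such a spatial predicate, a touching test could be sketched for axis-aligned bounding boxes as below; this is a deliberate simplification, since the module operates on the recognized shapes themselves rather than on boxes:

```python
def touching(box_a, box_b, tol=1e-3):
    """Axis-aligned boxes given as ((xmin, ymin, zmin), (xmax, ymax, zmax)).
    Two boxes 'touch' if they overlap, or their gap along every axis is
    within the tolerance tol."""
    (amin, amax), (bmin, bmax) = box_a, box_b
    for k in range(3):
        gap = max(bmin[k] - amax[k], amin[k] - bmax[k])
        if gap > tol:
            return False
    return True
```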

Ontology-based 3D object modelling module

The 3D object modelling module has the form of a logical skeleton (tree) structure, called a concept. The leaves of a concept can contain any entities present in our system, namely basic 3D shapes, 2D shapes, texture classes, or information derived from these. The edges of the tree are relations between the compound entity at the top and the entities one level down, which in the simplest case can be thought of as the relation between the elements of a set and the set as a whole. Concepts can therefore be identified with object models.
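The tree structure could be sketched as follows; the `Concept` type, the simple part-of relation on the edges, and the table example are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A node of the concept tree: a compound entity whose children are
    simpler entities (sub-concepts or leaf labels such as basic shapes)."""
    name: str
    parts: list = field(default_factory=list)

    def leaves(self):
        """Collect all leaf entities this concept is ultimately built from."""
        if not self.parts:
            return [self.name]
        out = []
        for p in self.parts:
            out.extend(p.leaves() if isinstance(p, Concept) else [p])
        return out

# A toy object model: a table is a plane top resting on four cylindrical legs.
table = Concept("table", [Concept("top", ["plane"]),
                          Concept("legs", ["cylinder"] * 4)])
```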

Ontology-based reasoning module

The reasoning module allows building instances of the concepts mentioned above that refer to physical objects found in the scene. To build instances of such objects given only their constituents, we implemented a knowledge database and a reasoning system. Our reasoning process is a search for a global solution by browsing partial solutions and applying decisive rules, much like matching puzzle pieces together. We solved the problem of incomplete solutions, which arise naturally when an object is only partially visible.
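The handling of partial visibility could be sketched as a matcher that instantiates a concept with a completeness score instead of demanding all constituents; the function, its greedy strategy, and the threshold are illustrative assumptions, not the actual rule system:

```python
def match_concept(required, detected, min_completeness=0.6):
    """Greedy partial matcher.  'required' lists the part labels a concept
    model expects; 'detected' lists the shape labels found in the scene.
    Returns the fraction of required parts that were matched, so a partially
    visible object still yields a (lower-confidence) instance, or None when
    too few constituents are present."""
    pool = list(detected)
    hits = 0
    for part in required:
        if part in pool:
            pool.remove(part)
            hits += 1
    completeness = hits / len(required)
    return completeness if completeness >= min_completeness else None
```

A table model expecting one plane and four cylinders would thus still be instantiated (at 0.8 completeness) when one leg is occluded.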

2D analysis modules

2D image segmentation module

The 2D image segmentation module is an intermediate module: it does not deliver any useful output outside the 2D analysis subsystem, but instead provides input to the 2D shape classification module and the 2D texture classification module. Its role is purely technical; it reduces the computation time by restricting the regions of interest in further processing.

2D shape classification module

The 2D shape classification module takes a segmented RGB image patch or the whole image and detects basic shapes (ellipse arcs and line segments). These shapes are then merged into chains, which are in turn classified against a database of shape classes of interest. The resulting shape class helps to identify distant objects.
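The chain classification step could be sketched as a nearest-match lookup over reference chains; the database entries, labels, and similarity measure below are illustrative assumptions:

```python
from difflib import SequenceMatcher

# Hypothetical shape-class database: each class is a reference chain of
# detected primitives (ellipse arcs and line segments).
SHAPE_DB = {
    "roughly_circular": ["arc", "arc", "arc", "arc"],
    "polygonal":        ["segment", "segment", "segment", "segment"],
}

def classify_chain(chain, db=SHAPE_DB, min_score=0.5):
    """Classify a detected chain of primitives by its best sequence
    similarity to a database entry; return None when nothing matches well."""
    best_cls, best_score = None, 0.0
    for cls, ref in db.items():
        score = SequenceMatcher(None, chain, ref).ratio()
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls if best_score >= min_score else None
```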

2D texture classification module

The 2D texture classification module takes a segmented RGB image patch or the whole image and computes a feature vector for that patch. The vector describes the patch as a whole, hence only global features are taken into account. The texture vector computation process is an exchangeable submodule. The classification process uses the boosting paradigm, which builds a strong classifier by combining a set of base classifiers; the base classifier is also exchangeable. The resulting texture class helps to identify distant objects by constraining their possible role in the environment.
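The boosting paradigm could be sketched as a minimal AdaBoost loop over 1-D features with threshold "stumps" as the (exchangeable) base classifier; the feature dimension, round count, and stump form are illustrative assumptions, not the module's actual configuration:

```python
import math

def train_adaboost(xs, ys, rounds=5):
    """Minimal AdaBoost: ys are +1/-1 texture-class labels, xs are scalar
    features.  Each round picks the best threshold stump on the current
    sample weights and re-weights the mistakes upward."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []                       # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None                     # (error, threshold, polarity)
        for t in sorted(set(xs)):
            for pol in (1, -1):
                err = sum(wi for wi, x, y in zip(w, xs, ys)
                          if (pol if x >= t else -pol) != y)
                if best is None or err < best[0]:
                    best = (err, t, pol)
        err, t, pol = best
        err = max(err, 1e-10)           # avoid log(0) on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # Mistakes of this stump get more weight in the next round.
        w = [wi * math.exp(-alpha * y * (pol if x >= t else -pol))
             for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Strong classifier: weighted vote of the base stumps."""
    score = sum(a * (pol if x >= t else -pol) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1
```

Swapping the stump for another weak learner leaves the boosting loop unchanged, which is exactly the exchangeability property noted above.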