3D data plays a crucial role in many areas of computer vision, including robot vision. A service robot usually perceives the world from a single perspective, which limits its capabilities: the global structure of a 3D object, needed by any grasping or avoidance task, cannot be extracted directly. By using a priori information about the considered object, however, a compact closed-surface (volume) representation can be obtained. Because the a priori model has a generic shape, a modelling step particularizes it through a deformation process. An example of such a deformation is illustrated in Fig. 1.
The main goal of the modelling approach is to obtain a refined 3D shape that best fits the real object in the scene. Given two Point Distribution Models (PDMs), one describing the perceived object and one representing the primitive, the first task is to determine the transformation matrix that optimally overlaps the two shapes. Once the shapes are aligned, the next problem is to move each primitive point along its normal direction in search of dense scene information. At the same time, additional rules (e.g. shape integrity conservation or attraction to the closest correct surface) have to be followed.
A generic primitive is considered to be an a priori shape describing a particular object. This information is stored, along with other objects, in a database connected to the modelling apparatus presented here. Not all 3D points are relevant to the primitive's structure: many of them serve only to create a smooth object surface. For this reason, each point receives a flag, or type. Two point types are defined: control points and regular points. A point flagged as a control point is considered to define the backbone structure of that particular model, whereas a regular point serves simply to smooth the global structure of the shape. Fig. 1 presents a generic primitive resembling a common bottle.
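As an illustration of the control/regular distinction, the sketch below stores a primitive as a point array plus a boolean flag per point. The class name `Primitive`, its fields, and the toy `bottle` coordinates are hypothetical and are not taken from the original system or its database.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Primitive:
    """A generic primitive stored as a point set (hypothetical layout)."""

    name: str
    points: np.ndarray      # (N, 3) coordinates in the primitive's local frame
    is_control: np.ndarray  # (N,) bool flag: True = control point, False = regular point

    def control_points(self) -> np.ndarray:
        """Points that define the backbone structure of the model."""
        return self.points[self.is_control]

    def regular_points(self) -> np.ndarray:
        """Points that only smooth the surface between control points."""
        return self.points[~self.is_control]


# Toy example: four points along an axis, every second one flagged as control.
bottle = Primitive(
    name="bottle",
    points=np.array([[0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.5],
                     [0.0, 0.0, 1.0],
                     [0.0, 0.0, 1.5]]),
    is_control=np.array([True, False, True, False]),
)
```

Keeping the flag as a mask next to the coordinates makes it cheap to run the modelling step on control points only and to treat the regular points separately afterwards.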
Since each shape is defined in its own local frame, a common coordinate system needs to be determined. For simplicity, the coordinate system of the perceived object is taken as the reference, so the alignment process reduces to finding the transformation between the primitive and the imaged object. The scale factor is determined from the radius of the circumscribed sphere, while a coarse rotation is found using a Euclidean distance minimization approach. The final rotation is then obtained through a fine rotation identification procedure based on an ICP algorithm. The following equations align the input models to a common frame:
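The alignment steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the circumscribed-sphere radius is approximated by the largest distance from the centroid, the scale is assumed to be the ratio of the two radii, the rotation is estimated with the Kabsch (SVD) method, and the ICP loop is rotation-only about the centroids with brute-force nearest neighbours. It also assumes that both point sets cover roughly the same extent of the object. The function names are hypothetical.

```python
import numpy as np


def bounding_radius(points):
    """Approximate the circumscribed-sphere radius as the largest centroid distance."""
    centroid = points.mean(axis=0)
    return np.linalg.norm(points - centroid, axis=1).max()


def kabsch(src, dst):
    """Rotation R minimizing sum ||R @ src_i - dst_i||^2 for centred, paired points."""
    H = src.T @ dst
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T


def align(primitive, scene, iterations=20):
    """Return (s, R, t) so that s * R @ p + t maps primitive points into the scene frame."""
    c_p, c_s = primitive.mean(axis=0), scene.mean(axis=0)
    s = bounding_radius(scene) / bounding_radius(primitive)  # assumed: ratio of radii
    src = s * (primitive - c_p)
    dst = scene - c_s
    R = np.eye(3)
    for _ in range(iterations):
        moved = src @ R.T
        # Brute-force nearest neighbour in the scene for every primitive point.
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = dst[d2.argmin(axis=1)]
        # Re-estimate the rotation from the current correspondences.
        R = kabsch(src, nearest)
    t = c_s - s * R @ c_p
    return s, R, t


# Synthetic check data: the scene is a scaled, rotated and shifted copy of the primitive.
rng = np.random.default_rng(0)
demo_primitive = rng.normal(size=(60, 3))
theta = 0.05
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
demo_scene = 2.0 * demo_primitive @ R_true.T + np.array([1.0, 2.0, 3.0])
s_est, R_est, t_est = align(demo_primitive, demo_scene)
```

A real scene seen from a single viewpoint is only partially observed, so the centroid and radius estimates above would be biased; a practical pipeline would need a more robust initialization and a spatial index for the nearest-neighbour search.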
Shape Modelling using numerical methods
The purpose of the modelling process is to fit the primitive model to the imaged object. In this step, the primitive must capture local geometric information directly from the scene. The information is local because each point extracts it only from a controlled vicinity. If that vicinity lacks sensed information, the generic primitive fills in the missing data. To keep the process time-efficient, modelling is applied only to control points, while the remaining points are repositioned relative to them in a linear post-modelling step. The movement of each point is governed by a set of energies, and a minimization process establishes the final position of the points. This technique is mainly related to Active Contours. Two types of energy can be distinguished: internal and external. The minimization task can be formulated as:
The internal energy is responsible for keeping the structure as smooth and continuous as possible, whereas the external energy drives the points to their final position. Fig. 2 presents the behavior of the initial contour under the influence of such forces.
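For reference, a common way to write such an objective, assuming the classic active-contour (snake) formulation, is given below. The contour parametrization $v(s)$, the weights $\alpha$ and $\beta$, and the exact form of each term are assumptions borrowed from that formulation and are not necessarily the notation used in this work.

```latex
v^{*} = \arg\min_{v} E(v), \qquad
E(v) = \int_{0}^{1} \Big[ E_{\mathrm{int}}\big(v(s)\big) + E_{\mathrm{ext}}\big(v(s)\big) \Big] \, ds

E_{\mathrm{int}}\big(v(s)\big) = \frac{1}{2} \Big( \alpha \, \lVert v'(s) \rVert^{2} + \beta \, \lVert v''(s) \rVert^{2} \Big)
```

In this form the first-derivative term penalizes stretching and the second-derivative term penalizes bending, while $E_{\mathrm{ext}}$ is lowest where the contour coincides with sensed scene data.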
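The interplay between the two energies can be illustrated with a small explicit-descent sketch on a closed contour of control points. It is a simplified stand-in, not the solver used in this work: the internal force uses finite-difference elasticity and bending terms, the external force pulls each point toward its nearest scene point, restricted to the point's normal direction and to a search radius (the "controlled vicinity"). All function names and parameter values (`alpha`, `beta`, `gamma`, `radius`) are assumptions.

```python
import numpy as np


def internal_force(points, alpha, beta):
    """Smoothing force on a closed contour: elasticity (alpha) and bending (beta)."""
    prev1, next1 = np.roll(points, 1, axis=0), np.roll(points, -1, axis=0)
    prev2, next2 = np.roll(points, 2, axis=0), np.roll(points, -2, axis=0)
    elastic = prev1 - 2 * points + next1
    bending = prev2 - 4 * prev1 + 6 * points - 4 * next1 + next2
    return alpha * elastic - beta * bending


def external_force(points, normals, scene, radius):
    """Pull toward the nearest scene point, along the normal, within a search radius."""
    diff = scene[None, :, :] - points[:, None, :]   # (N, M, D) offsets to scene points
    dist = np.linalg.norm(diff, axis=2)
    nearest = diff[np.arange(len(points)), dist.argmin(axis=1)]
    along = (nearest * normals).sum(axis=1, keepdims=True) * normals
    in_reach = dist.min(axis=1, keepdims=True) <= radius
    # Points with no scene data in reach feel no external force,
    # so the primitive's own shape is kept there.
    return np.where(in_reach, along, 0.0)


def fit_contour(points, normals, scene, alpha=0.1, beta=0.01,
                gamma=0.5, radius=1.0, steps=100):
    """Move control points by explicit descent steps until the forces balance."""
    pts = points.astype(float).copy()
    for _ in range(steps):
        pts += gamma * external_force(pts, normals, scene, radius)
        pts += internal_force(pts, alpha, beta)
    return pts


# Toy example: a unit circle of 16 control points attracted by a scene circle of radius 1.5.
angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
contour = np.stack([np.cos(angles), np.sin(angles)], axis=1)
contour_normals = contour.copy()  # outward normals of the unit circle
scene_angles = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
scene_pts = 1.5 * np.stack([np.cos(scene_angles), np.sin(scene_angles)], axis=1)
fitted = fit_contour(contour, contour_normals, scene_pts)
fitted_radii = np.linalg.norm(fitted, axis=1)
```

In the toy example the contour expands toward the scene circle but settles slightly inside it, because the internal force keeps pulling the points together; the weights decide how that trade-off is resolved.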
Shape Modelling using GFPNet
GFPNet is a neural network specifically designed to improve on the previous numerical shape modelling approach, in which the generic primitive was modelled using a solver for first- and second-order differential equations to compute the internal and external contour energies. In the GFPNet method, the numerical solver is replaced by a deep neural network (DNN) that better captures the characteristics of the surface and can handle strongly deformed surfaces. The following videos show the modelling process in action, applied to completing the shape of two objects: a mug and a car body.
T.T. Cocias, A. Razvant and S.M. Grigorescu "GFPNet: A Deep Network for Learning Shape Completion in Generic Fitted Primitives", IEEE Robotics and Automation Letters, 2020.
T.T. Cocias, F. Moldoveanu and S.M. Grigorescu "Generic Fitted Shapes (GFS): Volumetric Object Segmentation in Service Robotics", Robotics and Autonomous Systems, Elsevier, Netherlands, 2013.
T.T. Cocias, F. Moldoveanu and S.M. Grigorescu "Generic Fitted Primitives (GFP): Towards Full Object Volumetric Reconstruction for Service Robotics", Proceedings of the 21st Int. Conf. in Central Europe on Computer Graphics, Visualization and Computer Vision, Plzen, Czech Republic, June 24-27, 2013.
T.T. Cocias, S.M. Grigorescu and F. Moldoveanu "Object Volumetric Estimation Based on Generic Fitted Primitives for Service Robotics", Proceedings of the International Conference on Computer Vision Theory and Applications, Rome, Italy, February 24-26, 2012.