Visual recognition for intelligent robots and cars
Y. Hirano

 

The application of object recognition to intelligent robots and cars has been growing rapidly in recent years.
For autonomous robots such as future service robots, object recognition in general scenes is very important.
These robots need to recognize the objects they handle, as well as obstacles and the 3D environment for autonomous movement, all in cluttered scenes. If a database of the shape and specific local descriptors of each object is given, objects can be detected and recognized in clutter by matching those descriptors.
The position and orientation of known objects can also be estimated in the same way as in camera pose estimation. For non-textured objects, a descriptor based on contour information can be used.
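As an illustration of the descriptor matching mentioned above, the following is a minimal sketch, not the method used in the talk. It assumes descriptors are plain numeric vectors and uses nearest-neighbor search with a ratio test, a common technique swapped in here for concreteness. The function name `match_descriptors`, the `ratio` value, and the toy vectors are all hypothetical.

```python
import math

def match_descriptors(query, database, ratio=0.8):
    """Return (query_index, database_index) pairs that pass the ratio test.

    Assumption: descriptors are equal-length numeric vectors compared
    with Euclidean distance.
    """
    matches = []
    for qi, q in enumerate(query):
        # Distances from this query descriptor to every database descriptor,
        # sorted so the two closest candidates come first.
        dists = sorted((math.dist(q, d), di) for di, d in enumerate(database))
        if len(dists) < 2:
            continue
        (best, best_idx), (second, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is clearly closer
        # than the runner-up; ambiguous matches are dropped.
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches

# Toy data (hypothetical): three database descriptors, two query descriptors.
database = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
query = [(0.9, 0.1, 0.0), (0.5, 0.5, 0.0)]
matches = match_descriptors(query, database)
```

The second query vector is equally close to two database entries, so the ratio test rejects it. In a cluttered scene, this kind of rejection is one way to suppress ambiguous correspondences before pose estimation.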
On the other hand, if there is no such database but most of the objects can be fitted to simple shape primitives, it is sufficient to separate each object and estimate its position and orientation in order to grasp it with a robot hand. For 3D reconstruction of objects and obstacles, we used dense depth matching and voxelization of the obtained 3D images.
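The voxelization step can be pictured with a minimal sketch. It assumes the 3D data is available as a list of (x, y, z) points in meters and that a uniform grid is used; the function name `voxelize`, the voxel size, and the sample points are hypothetical and not taken from the talk.

```python
import math

def voxelize(points, voxel_size):
    """Return the set of occupied voxel indices (ix, iy, iz).

    Assumption: a uniform grid aligned with the coordinate axes,
    with its origin at (0, 0, 0).
    """
    occupied = set()
    for x, y, z in points:
        # floor() keeps negative coordinates in the correct cell.
        occupied.add((math.floor(x / voxel_size),
                      math.floor(y / voxel_size),
                      math.floor(z / voxel_size)))
    return occupied

# Toy point cloud (hypothetical values, in meters).
points = [(0.1, 0.1, 0.6), (0.2, 0.05, 0.7), (0.3, 0.1, 0.6), (-0.1, 0.1, 0.6)]
voxels = voxelize(points, voxel_size=0.25)
```

The first two points fall into the same cell, so four points yield three occupied voxels. A coarser grid reduces memory and computation at the cost of spatial accuracy.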
However, problems such as accuracy and computation speed remain to be solved before these methods can be applied to actual systems.
For cars, some systems already use visual recognition, as follows:
- Lane departure warning and lane keeping assist systems using white line detection.
- Detection of obstacles in front of the vehicle using stereo images.
- Pedestrian detection and warning systems using infrared images.
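For the stereo-based obstacle detection in the list above, the underlying geometry can be sketched as follows. This is a simplified illustration assuming a rectified pinhole stereo pair, where depth is Z = f * B / d for focal length f (pixels), baseline B (meters), and disparity d (pixels). The function names, camera parameters, and range threshold are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity (or no match).
        return float("inf")
    return focal_px * baseline_m / disparity_px

def obstacle_columns(disparities, focal_px, baseline_m, max_range_m):
    """Indices of measurements whose depth is within max_range_m."""
    return [i for i, d in enumerate(disparities)
            if depth_from_disparity(d, focal_px, baseline_m) <= max_range_m]

# Hypothetical disparities (pixels) sampled along one image row.
disparities = [0.0, 4.0, 16.0, 40.0]
near = obstacle_columns(disparities, focal_px=800.0, baseline_m=0.5,
                        max_range_m=30.0)
```

With these assumed parameters the four disparities correspond to depths of infinity, 100 m, 25 m, and 10 m, so only the last two are flagged. Because depth is inversely proportional to disparity, distant obstacles are measured much less precisely than near ones.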
Nevertheless, many more applications are expected for future intelligent vehicles, both to prevent potential traffic accidents and to assist driving.
To realize those future systems, recognition and prediction of the motion of pedestrians, other cars, bikes, etc. will be necessary. Recognition of traffic signs, signals, etc., as well as segmentation and categorization of road areas, sidewalks, guardrails, crosswalks, crossroads, etc., will also be necessary.
Furthermore, to anticipate and prevent possible future dangers, scene understanding that takes the context of the scene into account will become important.
However, these tasks still face many difficulties, such as heavy occlusion and very large variation in weather, lighting conditions, and object appearance. A drastic reduction in both the missed-detection rate and the erroneous-detection rate is still needed.
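The two error rates can be made concrete with a small sketch. The definitions are assumptions, since the abstract does not specify them: the missed-detection rate is taken as missed objects divided by all actual objects, and the erroneous-detection rate as wrong detections divided by all reported detections. The function name and counts are hypothetical.

```python
def detection_error_rates(true_positives, false_negatives, false_positives):
    """Return (miss_rate, false_detection_rate) from raw counts.

    Assumed definitions:
      miss_rate            = FN / (TP + FN)
      false_detection_rate = FP / (TP + FP)
    """
    actual = true_positives + false_negatives     # objects really present
    reported = true_positives + false_positives   # detections the system made
    miss_rate = false_negatives / actual if actual else 0.0
    false_detection_rate = false_positives / reported if reported else 0.0
    return miss_rate, false_detection_rate

# Hypothetical counts: 90 correct detections, 10 misses, 30 false alarms.
miss_rate, false_detection_rate = detection_error_rates(90, 10, 30)
```

The two rates usually trade off against each other: lowering a detection threshold reduces misses but raises false alarms, which is why both must improve together.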

 
