Translated Titles

3D Object Detection and Pose Estimation from a Depth Image





Key Words

Object Detection ; Pose Estimation ; 3D Object




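Chinese Abstract (English translation)

In this thesis, we propose a system that automatically localizes objects in a depth map containing multiple objects and obtains their corresponding pose estimates; the system can be extended to robotic applications. The proposed object localization algorithm is based mainly on keypoint extraction, combined with the FPFH descriptor and the RANSAC algorithm to find correct correspondences. Keypoint detection extends the Harris detector for 2D images: in 3D space, the method uses mesh connectivity to establish relations between points, and uses each point together with its neighboring points to implement a 3D Harris detector. The FPFH descriptor then describes the feature at each keypoint, and a set of similar corresponding points is computed. Finally, a geometry-based RANSAC algorithm uses geometric properties to select the correct combinations from the set of similar correspondences. The proposed system combines keypoint detection and the RANSAC algorithm to detect objects, uses the ICP algorithm to refine the localization result, and computes the rigid transformation from the correct correspondences to obtain the pose estimate. In our experiments, we evaluate the pose estimation error of the proposed method using 3D object models and simulated depth data, and finally show experimental results on real data.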
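The abstract does not spell out which geometric property the RANSAC step relies on. One property such a step could plausibly exploit is that a rigid motion preserves distances between points, so two correct matches must be the same distance apart in the model and in the scene. The sketch below illustrates that check only; it is an assumption, not the thesis's algorithm, and the names `consistent_pairs`, `best_supported_matches`, and `tol` are hypothetical.

```python
import numpy as np

def consistent_pairs(model_pts, scene_pts, tol=0.01):
    """Boolean matrix C: C[i, j] is True when candidate matches i and j
    keep the same mutual distance (within tol) in the model and the scene.

    model_pts[k] and scene_pts[k] are the two ends of candidate match k.
    """
    model_pts = np.asarray(model_pts, dtype=float)
    scene_pts = np.asarray(scene_pts, dtype=float)
    # Pairwise Euclidean distances inside each point set.
    d_model = np.linalg.norm(model_pts[:, None, :] - model_pts[None, :, :], axis=-1)
    d_scene = np.linalg.norm(scene_pts[:, None, :] - scene_pts[None, :, :], axis=-1)
    return np.abs(d_model - d_scene) <= tol

def best_supported_matches(model_pts, scene_pts, tol=0.01):
    """Greedy heuristic (assumption): pick the match that agrees with the
    most other matches and return every match consistent with it."""
    C = consistent_pairs(model_pts, scene_pts, tol)
    seed = np.argmax(C.sum(axis=1))
    return np.flatnonzero(C[seed])

# Three correct matches (scene = model shifted by (5, 5, 5)) plus one outlier.
model = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
scene = model + 5.0
scene[3] = [100.0, 100.0, 100.0]
kept = best_supported_matches(model, scene)
```

A filter of this kind would only prune candidates; a full RANSAC loop would still need to sample subsets, fit a transformation, and count inliers.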

English Abstract

In this thesis, we propose a system for automatic object detection and pose estimation from a single depth map containing multiple objects, intended for robot applications. The proposed object detection algorithm matches keypoints extracted from the depth image, using the FPFH descriptor together with the proposed geometry-based RANSAC algorithm. The keypoint detector extends the 2D Harris corner detector to 3D. Candidate correspondences are then selected according to the distance between the FPFH features of the keypoints. The proposed geometry-based RANSAC algorithm uses geometric characteristics to choose the inliers among these candidate correspondences. The system combines keypoint detection and geometry-based RANSAC to detect the objects, then applies the ICP algorithm to refine the 3D object alignment. The rigid transformation that gives the pose estimate is computed from the corresponding points. Experiments on simulated and real-world depth data demonstrate the accuracy of the pose estimates produced by the proposed system.
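The abstract states that the rigid transformation is computed from corresponding points but not how. A common way to do this is the SVD-based least-squares fit (often called the Kabsch method), sketched below with NumPy. Whether the thesis uses exactly this formulation is an assumption; the function name `rigid_transform` and the sample data are hypothetical.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that
    dst[i] is approximately R @ src[i] + t (SVD-based fit)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so det(R) = +1 (rotation, not reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Synthetic check: rotate 90 degrees about z, then translate.
R_true = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]])
t_true = np.array([1.0, 2.0, 3.0])
src = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
dst = src @ R_true.T + t_true
R_est, t_est = rigid_transform(src, dst)
```

With noise-free correspondences the fit recovers the transformation up to floating-point error; with noisy inliers it returns the least-squares estimate, which a refinement step such as ICP could then improve.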

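Topic Category Basic and Applied Sciences > Information Science
College of Electrical Engineering and Computer Science > Institute of Information Systems and Applications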