Monocular Vision Based Simultaneous Localization and Mapping for a Wheeled Robot

Key Words

Real-time localization ; Robot ; Monocular vision ; SLAM ; EKF ; Mono



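Chinese Abstract (English translation)

In robotics research, simultaneous localization and mapping (SLAM) is a topic receiving growing attention. SLAM requires a robot placed at an unknown location, with no knowledge of its surroundings, to build a consistent map by observing the environment while localizing itself within that map. Odometer readings are an intuitive source of information, but odometer measurement error accumulates over time. With the help of other sensors, such as ultrasonic range finders or cameras, and with feedback through a filter, SLAM can produce a more accurate map. The filter most commonly used in SLAM is the extended Kalman filter (EKF).

This thesis proposes a monocular vision method for simultaneous localization and mapping on a wheeled robot, using only the images from a single web camera to correct the odometer data. Feature points are extracted and matched with the SIFT method proposed by D. Lowe. When a feature point appears at a location different from the predicted one, the EKF uses the discrepancy to correct the map. This correction, however, requires the depth of each feature point, and the inability to measure depth is the biggest problem of monocular vision. This thesis represents the 3D position of each feature point with the inverse depth coordinates proposed by J. Civera. The advantage of this representation is that it conforms better to the linearity assumption of the EKF, so the EKF estimate of the feature position converges faster. J. Civera, however, used only a single camera for real-time localization and mapping, without odometer data. Lacking velocity information is a missed opportunity, since many vehicles today are equipped with odometers. The proposed system incorporates the odometer and, in tests on indoor image sequences, provides an accurate map.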
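
The correction step described above, in which the EKF uses the gap between the predicted and the observed feature position, can be written in the standard textbook EKF form. This is a generic sketch, not the thesis's own derivation, and every symbol is an assumption introduced here for illustration: $\hat{x}_{k|k-1}$ and $P_{k|k-1}$ are the predicted state and covariance (for example, after propagating the odometer data), $z_k$ is the observed image position of a feature, $h(\cdot)$ is the measurement model that predicts that position, $H_k$ is its Jacobian, and $R_k$ is the measurement noise covariance.

```latex
\begin{aligned}
\nu_k &= z_k - h(\hat{x}_{k|k-1}) \\
S_k &= H_k P_{k|k-1} H_k^{\top} + R_k \\
K_k &= P_{k|k-1} H_k^{\top} S_k^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \nu_k \\
P_{k|k} &= (I - K_k H_k) P_{k|k-1}
\end{aligned}
```

The innovation $\nu_k$ is the discrepancy between observation and prediction, and the gain $K_k$ decides how much of it is fed back into the robot pose and the map.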

English Abstract

Simultaneous localization and mapping (SLAM) has become a popular topic in the autonomous robot research community. SLAM requires a mobile robot, placed at an unknown location in an unknown environment, to incrementally build a consistent map of the environment while simultaneously determining its location within that map. Odometer data is helpful, but its error accumulates. Assisted by measurements from other sensors, SLAM improves accuracy through filtering. The extended Kalman filter (EKF) is the most widely used method for feeding back the error. In this thesis, a monocular vision based SLAM method for a wheeled robot is proposed. A single web camera is the only measurement input used to correct the odometer data. Features are extracted and matched with the scale invariant feature transform (SIFT). The absence of feature depth information is the key issue for monocular vision in SLAM. The inverse depth parametrization addresses this problem by representing the feature position in inverse depth coordinates under some reasonable assumptions. Because this representation is closer to linear, the EKF estimate of the feature position converges more easily. By combining the odometer data with the inverse depth method, the system provides an accurate map in tests on indoor image sequences.
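
To make the inverse depth representation concrete, the following minimal Python sketch converts a six-parameter inverse depth feature into a Euclidean 3D point, following the form published by Civera et al. [24]. It is an illustration only, not the thesis's implementation: the function name `inverse_depth_to_point` and the sample values are hypothetical, and the angle convention (azimuth `theta`, elevation `phi`) is assumed to match [24].

```python
import math


def inverse_depth_to_point(x, y, z, theta, phi, rho):
    """Convert an inverse depth feature to a Euclidean 3D point.

    (x, y, z)  : camera optical centre when the feature was first observed
    theta, phi : azimuth and elevation of the observation ray
    rho        : inverse depth along that ray (1 / distance), must be non-zero
    """
    # Unit direction vector of the ray, using the convention assumed from [24].
    m = (
        math.cos(phi) * math.sin(theta),
        -math.sin(phi),
        math.cos(phi) * math.cos(theta),
    )
    depth = 1.0 / rho
    return (x + depth * m[0], y + depth * m[1], z + depth * m[2])


# Hypothetical feature seen straight ahead from the origin with rho = 0.5.
point = inverse_depth_to_point(0.0, 0.0, 0.0, 0.0, 0.0, 0.5)
print(point)
```

Because the filter estimates `rho` rather than the depth itself, a very distant feature corresponds to a small, finite `rho`, which is one reason this representation behaves well when a single camera cannot yet observe depth.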

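Topic Category College of Electrical Engineering and Computer Science > Department of Electrical Engineering
Engineering > Electrical Engineering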
  [1] R. Smith, M. Self, and P. Cheeseman, “Estimating uncertain spatial relationships in robotics,” in Autonomous Robot Vehicles, Springer, 1990.
  [2] G. Dissanayake, P. Newman, H. Durrant-Whyte, S. Clark, and M. Csorba, “A solution to the simultaneous localisation and mapping (SLAM) problem,” IEEE Transactions on Robotics and Automation, vol. 17, no. 3, pp. 229–241, 2001.
  [3] J. A. Castellanos, J. Neira, and J. D. Tardós, “Limits to the consistency of EKF-based SLAM,” 5th IFAC Symp. on Intelligent Autonomous Vehicles, IAV’04, Lisbon, Portugal, July 2004.
  [4] S. Huang and G. Dissanayake, “Convergence analysis for extended Kalman filter based SLAM,” in Proceedings of the IEEE International Conference on Robotics and Automation, May 2006, pp. 412–417.
  [5] L. Paz, P. Jensfelt, J. D. Tardós, and J. Neira, “EKF SLAM Updates in O(n) with Divide and Conquer,” in 2007 IEEE Int. Conf. on Robotics and Automation, Italy, 2007.
  [6] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, “Fast-SLAM: A factored solution to the simultaneous localization and mapping problem,” Proc. AAAI Nat. Conf. Artificial Intelligence, 2002, pp. 593–598.
  [7] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, “Fast-SLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges,” in Proc. Int. Joint Conf. Artificial Intelligence, 2003, pp. 1151–1156.
  [8] G. Grisetti, G. Tipaldi, C. Stachniss, W. Burgard, and D. Nardi, “Fast and accurate SLAM with Rao-Blackwellized particle filters,” Robotics and Autonomous Systems, vol. 55, no. 1, pp. 30–38, 2007.
  [9] S. Thrun, W. Burgard, and D. Fox, “A real-time algorithm for mobile robot mapping with applications to multi-robot and 3D mapping,” in Proceedings of the IEEE International Conference on Robotics and Automation, 2000, pp. 321–328.
  [10] H. Surmann, A. Nüchter, and J. Hertzberg, “An autonomous mobile robot with a 3D laser range finder for 3D exploration and digitalization of indoor environments,” Robotics and Autonomous Systems, vol. 45, pp. 181–198, 2003.
  [11] A. Nüchter, H. Surmann, K. Lingemann, J. Hertzberg, and S. Thrun, “6D-SLAM with an application in autonomous mine mapping,” in Proceedings of the IEEE International Conference on Robotics and Automation, 2004, pp. 1998–2003.
  [12] P. M. Newman, D. M. Cole, and K. Ho, “Outdoor SLAM using visual appearance and laser ranging,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2006.
  [13] A. J. Davison and D. W. Murray, “Mobile Robot Localisation Using Active Vision,” Proc. Fifth European Conf. Computer Vision, pp. 809–825, 1998.
  [14] S. Se, D. G. Lowe, and J. Little, “Vision-based Mobile Robot Localization And Mapping using Scale-Invariant Features,” in Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2051–2058, May 2001.
  [15] L. Paz, P. Piniés, J. Tardós, and J. Neira, “Large-Scale 6-DOF SLAM With Stereo-in-Hand,” IEEE Transactions on Robotics, 24(5), 2008.
  [16] J. Sola, A. Monin, and M. Devy, “BiCamSLAM: Two times mono is more than stereo,” in Proceedings of the IEEE International Conference on Robotics and Automation, Italy, 2007, pp. 4795–4800.
  [17] R. Hartley, “In Defense of the Eight-Point Algorithm,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 19(6):580–593, 1997.
  [18] E. Kruppa, Zur Ermittlung eines Objektes aus Zwei Perspektiven mit Innerer Orientierung, Sitz.-Ber. Akad. Wiss., Wien, Math. Naturw. Kl., Abt. IIa., 122:1939-1948, 1913.
  [19] D. Nistér, “An efficient solution to the five-point relative pose problem,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26(6), 2004.
  [20] A. Fitzgibbon and A. Zisserman, “Automatic Camera Recovery for Closed or Open Image Sequences,” ECCV, pp. 311–326, 1998.
  [21] T. Sato, M. Kanbara, N. Yokoya, and H. Takemura, “Dense 3-D reconstruction of an outdoor scene by hundreds-baseline stereo using a hand-held video camera,” International Journal of Computer Vision, 47(1-3):119–129, April 2002.
  [22] M. Pollefeys, D. Nistér, J.-M. Frahm, A. Akbarzadeh, P. Mordohai, B. Clipp, C. Engels, D. Gallup, S.-J. Kim, P. Merrell, C. Salmi, S. Sinha, B. Talton, L. Wang, Q. Yang, H. Stewénius, R. Yang, G. Welch, and H. Towles, “Detailed Real-Time Urban 3D Reconstruction from Video,” International Journal of Computer Vision, vol. 78, no. 2-3, pp. 143–167, July 2008.
  [23] A. J. Davison, N. D. Molton, I. Reid, and O. Stasse, “MonoSLAM: Real-time single camera SLAM,” IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 29(6):1052–1067, 2007.
  [24] J. Civera, A. J. Davison, and J. M. M. Montiel, “Inverse depth parametrization for monocular SLAM,” IEEE Transactions on Robotics (T-RO), 24(5):932–945, 2008.
  [25] H. Strasdat, J. Montiel, and A. J. Davison, “Real-time monocular SLAM: Why filter?” in Int. Conf. on Robotics and Automation, Anchorage, AK, May 2010.
  [26] B. Williams and I. Reid, “On Combining Visual SLAM and Visual Odometry,” Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2010.
  [27] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2000.
  [28] J. Y. Bouguet, Camera Calibration Toolbox for Matlab, http://www.vision.caltech.edu/bouguetj/calib_doc/index.html
  [29] J. Heikkila and O. Silven, “A Four-step Camera Calibration Procedure with Implicit Image Correction,” Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition, pp. 1106–1112, 1997.
  [30] Z. Zhang, “A Flexible New Technique for Camera Calibration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 11, pp. 1330–1334, November 2000.
  [31] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision (IJCV), 60(2):91–110, 2004.
  [32] C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” Proceedings of The Fourth Alvey Vision Conference, pp. 147–151, 1988.
  [33] J. Shi and C. Tomasi, “Good Features to Track,” Proceedings of the Conference on Computer Vision and Pattern Recognition, pp. 593–600, June 1994.
  [34] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, “Speeded-Up Robust Features (SURF),” Computer Vision and Image Understanding, vol. 110, no. 3, pp. 346–359, June 2008.
  [35] J. J. Koenderink, “The structure of images,” Biological Cybernetics, 50:363–396, 1984.
  [36] T. Lindeberg, “Scale-space theory: A basic tool for analysing structures at different scales,” Journal of Applied Statistics, 21(2):224–270, 1994.
  [37] K. Mikolajczyk and C. Schmid, “An affine invariant interest point detector,” European Conference on Computer Vision (ECCV), Copenhagen, Denmark, pp. 128–142, 2002.
  [38] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics (Intelligent Robotics and Autonomous Agents), The MIT Press, September 2005. ISBN 0262201623.