
Model Reconstruction Using Visual SLAM and Deep Stereo Matching

Advisor: 陳永昇

Abstract


Three-dimensional model reconstruction of the environment has long been an important research topic in computer vision. Early work often relied on laser rangefinders as the primary sensor; although they can measure absolute distances, the results are usually sparse point clouds. In recent years, with advances in technology, cameras have become cheaper and more compact while capturing high-quality images. Moreover, cameras are passive sensors: they interfere with neither the environment nor other sensors, so reconstruction methods that rely purely on images have become increasingly popular. This thesis proposes a 3D environment reconstruction system that uses a stereo camera as its sensor and combines simultaneous localization and mapping (SLAM) with a deep stereo matching network, so that the user can browse the scene live while dense point clouds of the environment are being computed; the point clouds are then post-processed to reconstruct a triangle mesh model.

In this system, the stereo camera must first be calibrated to obtain its intrinsic and extrinsic parameters. The next step uses the camera parameters and the stereo image pairs captured by the stereo camera for camera localization, dense point cloud reconstruction, and live browsing. Finally, the dense point clouds of the scene produced by the system are integrated, and their mesh model is reconstructed. Experimental results show that the system performs dense point cloud reconstruction at a reasonable speed. We demonstrate the system on the KITTI dataset and show its potential for extension to practical use.
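The calibration step described above could look like the following minimal sketch, assuming OpenCV and a planar checkerboard target; the board dimensions, square size, and image paths are illustrative placeholders rather than values taken from the thesis.

```python
import glob
import cv2
import numpy as np

BOARD = (9, 6)       # inner-corner count of the assumed checkerboard (cols, rows)
SQUARE = 0.025       # assumed square size in metres

# 3D coordinates of the board corners in the board's own coordinate frame.
objp = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_pts, left_pts, right_pts = [], [], []
size = None
for lf, rf in zip(sorted(glob.glob("left/*.png")), sorted(glob.glob("right/*.png"))):
    gl = cv2.imread(lf, cv2.IMREAD_GRAYSCALE)
    gr = cv2.imread(rf, cv2.IMREAD_GRAYSCALE)
    size = gl.shape[::-1]
    ok_l, corners_l = cv2.findChessboardCorners(gl, BOARD)
    ok_r, corners_r = cv2.findChessboardCorners(gr, BOARD)
    if ok_l and ok_r:
        obj_pts.append(objp)
        left_pts.append(corners_l)
        right_pts.append(corners_r)

# Intrinsics of each camera first, then the extrinsics (R, T) relating the pair.
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, left_pts, size, None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, right_pts, size, None, None)
_, K1, d1, K2, d2, R, T, _, _ = cv2.stereoCalibrate(
    obj_pts, left_pts, right_pts, K1, d1, K2, d2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)

# Rectification makes epipolar lines horizontal, as stereo matching expects.
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, d1, K2, d2, size, R, T)
```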

Abstract (English)


Three-dimensional reconstruction of the environment has always been an important subject in computer vision and robotics. Laser rangefinders, such as LIDAR, were often used to acquire range data in earlier works. Although they measure distance in metric units, the acquired data are usually sparse. In recent years, cameras have become more inexpensive and compact, and they can acquire high-quality images. Moreover, they are passive sensors that interfere with neither the scene nor other sensors, and they have a virtually unlimited measurement range. Therefore, purely vision-based approaches have gained significant interest. We present a system capable of dense 3D reconstruction and live scene navigation, with surface reconstruction accomplished as a post-processing step. We use a stereo camera as the sensor, combining visual SLAM and a deep stereo matching network to achieve our goal. In the proposed system, the stereo rig must first be calibrated to obtain its intrinsic and extrinsic parameters. Next, with the calibrated stereo camera, the live SLAM system estimates the camera trajectory and generates dense point clouds of the scene, visualizing the reconstruction result on the fly. After the live SLAM system terminates, the produced dense point clouds are integrated using their camera poses and reconstructed into a triangle mesh model. Experimental results show that the system achieves live dense 3D reconstruction at an acceptable frame rate. We demonstrate the system on the KITTI benchmark dataset and show its potential for extension to practical use cases.
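The post-processing stage, fusing the per-frame dense point clouds with their estimated camera poses and extracting a triangle mesh, could be sketched as follows with Open3D. The intrinsics (fx, fy, cx, cy), the stereo baseline, the pose list, and the choice of Poisson surface reconstruction are assumptions for illustration and may differ from the exact method used in the thesis.

```python
import numpy as np
import open3d as o3d

def disparity_to_points(disp, fx, fy, cx, cy, baseline):
    """Back-project a disparity map (H x W, in pixels) to 3D points in the camera frame."""
    h, w = disp.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disp > 0.5                      # skip pixels with no reliable match
    z = fx * baseline / disp[valid]         # depth from the rectified stereo geometry
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def fuse_and_mesh(disparities, poses, fx, fy, cx, cy, baseline):
    """Merge per-frame point clouds using their camera-to-world poses, then mesh."""
    merged = o3d.geometry.PointCloud()
    for disp, T_wc in zip(disparities, poses):       # T_wc: 4x4 camera-to-world pose
        pts = disparity_to_points(disp, fx, fy, cx, cy, baseline)
        pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(pts))
        pcd.transform(T_wc)                          # in place: camera frame -> world frame
        merged += pcd
    merged = merged.voxel_down_sample(voxel_size=0.05)   # thin out overlapping regions
    merged.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.2, max_nn=30))
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(merged, depth=10)
    return mesh
```

Voxel downsampling before meshing is one common way to thin out the heavily overlapping regions that accumulate along the trajectory; the system described in the thesis may integrate the clouds differently.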

