
Image-based Localization Using a Feature-Weighted Mechanism for Adjusting Feature Correlation

Advisor: 洪一平

Abstract


Using deep learning for image-based localization has become a trend in localization research in recent years. Deep architectures can be parallelized on a GPU and run fast enough for real-time use, and their memory consumption does not grow over time: a fixed-size model learns the contents of a scene, and its convolutional layers approximate the geometric computations of traditional localization. In this thesis, we develop an end-to-end localization system that integrates a feature-weighted mechanism with a long short-term memory (LSTM) model and is trained with a loss function that automatically learns the relative scale weights of position and rotation. We then evaluate the system on outdoor and indoor datasets and compare it with two classic deep learning localization systems. In terms of median error, our system improves on both of them on the outdoor and the indoor datasets, and the error trajectory plots also show a noticeable improvement. Finally, the system runs at more than 90 frames per second on average, which is sufficient for real-time operation.
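
The abstract does not give the exact form of the loss or of the evaluation metric, so the following is only a minimal sketch under stated assumptions. It uses one common formulation from the pose-regression literature, in which two trainable scalars (here the hypothetical names `s_x` and `s_q`) balance the position and rotation terms, and it shows how median position and rotation errors are typically computed. All function and parameter names are illustrative and are not taken from the thesis.

```python
import math


def weighted_pose_loss(pos_err, rot_err, s_x, s_q):
    """Combine position and rotation errors with learnable weights.

    s_x and s_q are assumed trainable scalars (hypothetical names).
    exp(-s) scales each error term, and the added +s term penalizes
    large s so the weights cannot shrink both errors to zero.
    """
    return pos_err * math.exp(-s_x) + s_x + rot_err * math.exp(-s_q) + s_q


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees between two unit quaternions given as (w, x, y, z)."""
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


def position_error(p_pred, p_true):
    """Euclidean distance between predicted and true positions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p_pred, p_true)))


def median(values):
    """Median of a non-empty sequence of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        return ordered[mid]
    return 0.5 * (ordered[mid - 1] + ordered[mid])
```

In a real training setup `s_x` and `s_q` would be optimized together with the network weights by an automatic-differentiation framework; the plain-Python version above only illustrates the arithmetic.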

References


[1] A. Kendall, M. Grimes, and R. Cipolla, "PoseNet: A convolutional network for real-time 6-DOF camera relocalization," in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 2938-2946.
[2] F. Walch, C. Hazirbas, L. Leal-Taixe, T. Sattler, S. Hilsenbeck, and D. Cremers, "Image-based localization using LSTMs for structured feature correlation," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 627-637.
[3] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[4] S. Brahmbhatt, J. Gu, K. Kim, J. Hays, and J. Kautz, "Geometry-aware learning of maps for camera localization," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 2616-2625.
[5] Z. Laskar, I. Melekhov, S. Kalia, and J. Kannala, "Camera relocalization by computing pairwise relative poses using convolutional neural network," in Proceedings of the IEEE International Conference on Computer Vision Workshops, 2017, pp. 929-938.
