Generally, simultaneous localization and mapping (SLAM) algorithms assume that they operate in a static environment; the real world, however, is often full of moving vehicles and walking pedestrians. Such a strict assumption limits their applicability, especially on robots or autonomous vehicles. In this work, we propose a semantic visual localization system that uses a depth (RGB-D) or stereo camera to estimate the camera's ego-pose in highly dynamic scenes. The system is built on ORB-SLAM2 and obtains its semantic information from a fast, state-of-the-art object detection method. We evaluate the system on two public datasets: it runs faster than DynaSLAM while achieving comparable localization accuracy. An analysis of the computation time is also presented.
Typically, simultaneous localization and mapping (SLAM) algorithms are assumed to operate only in static environments. However, the real world is ordinarily full of moving cars and people. Such a strict assumption restricts their usability, especially on robots or autonomous vehicles. In this work, we present a semantic SLAM system for RGB-D and stereo cameras that can deal with highly dynamic scenes. The visual SLAM system is built on ORB-SLAM2, and the semantic information is acquired from a state-of-the-art object detector that runs at a high frame rate. Experiments are conducted on two public benchmarks. Our system is faster than DynaSLAM while maintaining similar localization accuracy. An analysis of the computation time is also presented.
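To make the general idea concrete, the following is a minimal sketch, not the implementation used in this work, of how semantic detections can be used to filter features in dynamic scenes: ORB keypoints that fall inside bounding boxes of detected dynamic objects are discarded before tracking. The detection format, the DYNAMIC_CLASSES label set, and the filter_dynamic_keypoints helper are assumptions introduced here for illustration only.

# Hypothetical sketch: discard ORB features inside detected dynamic objects
# (e.g., people, cars) before pose tracking. Not the system's actual code;
# the detector output format below is an assumption.
import cv2

DYNAMIC_CLASSES = {"person", "car", "bus", "bicycle"}  # assumed label set

def filter_dynamic_keypoints(gray, detections):
    """Detect ORB features and keep only those outside dynamic-object boxes.

    detections: list of (class_name, (x1, y1, x2, y2)) from any object
    detector; this interface is assumed for illustration.
    """
    orb = cv2.ORB_create(nfeatures=1000)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    if descriptors is None:
        return [], None

    keep = []
    for i, kp in enumerate(keypoints):
        x, y = kp.pt
        inside_dynamic = any(
            cls in DYNAMIC_CLASSES and x1 <= x <= x2 and y1 <= y <= y2
            for cls, (x1, y1, x2, y2) in detections
        )
        if not inside_dynamic:
            keep.append(i)

    static_kps = [keypoints[i] for i in keep]
    static_desc = descriptors[keep] if keep else None
    return static_kps, static_desc

The remaining (presumably static) keypoints and descriptors would then be passed to the usual feature matching and pose estimation stages of the SLAM pipeline.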