Simultaneous Localization, Mapping, and Image Object Detection for an Indoor Autonomous Navigation Drone System

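In recent years, multirotor drones have come into wide use in both industry and academia. Industrial uses include cargo delivery, agricultural irrigation, and police patrols, while researchers use drones for automated data collection, construction-site map reconstruction, traffic flow monitoring, and disaster rescue. Although these applications demonstrate the drone's good mobility, autonomous drone movement today mostly depends on the Global Positioning System (GPS) for accurate positioning. Where the GPS signal is poor, or indoors, the drone cannot navigate autonomously. In addition, operating a drone in an unknown environment usually requires a professional pilot to keep it from crashing. In this research, we propose a complete system that lets a drone quickly build an obstacle map with six degrees of freedom. The map is built mainly from color images, depth images, and camera odometry. Using the obstacle map and the odometry, our drone can navigate indoors autonomously. After the drone has explored the indoor space, our system uses the collected data to build a 3D map with semantic information. This research proposes a new deep model, based on the Transformer architecture, that performs semantic segmentation on a sequence of images. We modified the attention mechanism of the conventional Transformer so that it can handle the more computationally demanding sequential data. We then reproject the semantically segmented data to build the 3D map. In our validation and comparison, our model achieves higher accuracy than other models on similar tasks.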

Keywords

Autonomous indoor drone; deep learning; computer vision; sequential model; semantic segmentation

Parallel Abstract

Micro aerial vehicles (MAVs) have been adopted across industries, where companies use them for merchandise delivery, agricultural spraying, and police patrols. MAVs have also gained academic attention: researchers use them to collect data for construction-site map generation, traffic flow monitoring, and disaster rescue. While these applications show that MAVs can travel remotely and unmanned, most rely heavily on the Global Positioning System (GPS) for accurate position information, which is usually unavailable in unstructured, crowded indoor environments. Furthermore, an MAV placed in an unknown environment usually needs to be controlled by a well-trained professional, because the lack of an environment map can cause autonomous navigation to fail. In this research, we developed an MAV system that builds a 6-DoF obstacle map by processing sensor data from RGB-D and odometry cameras, enabling the drone to navigate autonomously in an indoor environment. After the MAV has navigated through an unknown environment, it is important for it to generate a semantic 3D map, which not only helps the user investigate the environment but also enables the MAV to carry out high-level navigation tasks. To generate this semantic map, we developed Sequential-DDETR, a new model based on the Transformer architecture for processing sequential data. Sequential-DDETR is an end-to-end model that generates segmentation images for a sequence of frames. We use deformable attention, which reduces computation significantly compared with the traditional Transformer. Sequential-DDETR computes attention features across frames to improve semantic segmentation performance on sequential images. We also use the depth image to back-project the sequence of semantic segmentation masks and build a semantic Simultaneous Localization and Mapping (Semantic-SLAM) map. Our results show that our model performs better in building Semantic-SLAM than other methods.
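The back-projection step mentioned above can be illustrated with a minimal sketch. It assumes a standard pinhole camera model with intrinsics `fx`, `fy`, `cx`, `cy`, which the abstract does not specify; the function name `back_project_labels` and the toy data are hypothetical and are not taken from the thesis. The points are expressed in the camera frame only. Placing them in a global map would additionally require the camera pose from odometry, which is not shown here.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def back_project_labels(
    depth: List[List[float]],
    labels: List[List[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Dict[int, List[Point]]:
    """Group back-projected 3D points (camera frame) by semantic label.

    Assumes a pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points: Dict[int, List[Point]] = {}
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0.0:
                # Treat non-positive depth as a missing measurement.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.setdefault(label, []).append((x, y, z))
    return points


# Toy 2x2 frame: depth in meters and one class id per pixel (made-up values).
depth = [[1.0, 2.0], [0.0, 4.0]]
labels = [[0, 1], [1, 1]]
cloud = back_project_labels(depth, labels, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Here `cloud` maps each class id to the 3D points whose pixels carry that label, so the pixel with zero depth is skipped rather than projected to the camera origin.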

Parallel Keywords

Autonomous Indoor MAV; Deep Learning; Computer Vision; Vision Transformer; Semantic SLAM
