3D object detection is an essential component of autonomous driving systems, responsible for localizing and classifying objects within the point clouds or depth images collected by sensors. This task enables self-driving vehicles and roadside units to perceive their environment effectively. Moreover, subsequent tasks such as decision-making rely heavily on the results of 3D object detection, so its accuracy directly influences the performance and safety of autonomous driving systems. In recent years, most work on 3D object detection has leveraged deep neural networks, which require annotated datasets for training. However, annotating object bounding boxes in point clouds is time-consuming and challenging. LiDAR is affected by occlusion and provides only partial views of the environment, and human annotators find it difficult to label complete bounding boxes for partial point clouds without assistance from other sensors. In this work, we propose a point cloud alignment pipeline that aligns sparse vehicle point clouds without requiring any annotated data and generates bounding boxes from the aggregated point cloud. Our pipeline uses the vehicle's contour for alignment, addressing the sparse point cloud alignment challenge that feature-based registration methods struggle to solve. The pipeline comprises a bounding box estimator that generates rough bounding boxes, an initial alignment based on these rough boxes, and point cloud registration that combines point-to-point and plane-to-plane methods. Experimental results show that our method improves the quality of the generated bounding boxes: it achieves a 10% increase in recall at an IoU threshold of 0.7 and outperforms feature-based registration methods in terms of translation and rotation error.
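The alignment stages described above can be sketched in a few dozen lines of numpy. The sketch below is illustrative only, not the paper's actual implementation: `initial_alignment` assumes a rough box is summarized by a center and a yaw angle (the box-frame convention is an assumption), and the refinement step is a minimal point-to-point ICP using brute-force nearest neighbours and the Kabsch SVD solution. The plane-to-plane (GICP-style) component of the combined registration is omitted for brevity.

```python
import numpy as np

def initial_alignment(points, box_center, box_yaw):
    """Move a vehicle point cloud into its rough-box frame.

    box_center and box_yaw would come from the rough bounding box
    estimator; both the signature and the frame convention here are
    hypothetical illustrations, not the paper's interface.
    """
    c, s = np.cos(-box_yaw), np.sin(-box_yaw)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (points - box_center) @ R.T

def icp_point_to_point(src, dst, iters=20):
    """Minimal point-to-point ICP refining src onto dst.

    Returns the accumulated 4x4 rigid transform and the aligned points.
    Real pipelines would use a k-d tree for correspondences; the O(n^2)
    search here keeps the sketch dependency-free.
    """
    T = np.eye(4)
    cur = src.copy()
    for _ in range(iters):
        # Brute-force nearest-neighbour correspondences.
        dists = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        matched = dst[dists.argmin(axis=1)]
        # Kabsch: optimal rigid transform between the matched sets.
        mu_s, mu_d = cur.mean(axis=0), matched.mean(axis=0)
        H = (cur - mu_s).T @ (matched - mu_d)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:  # guard against a reflection solution
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        cur = cur @ R.T + t
        step = np.eye(4)
        step[:3, :3], step[:3, 3] = R, t
        T = step @ T
    return T, cur
```

In this sketch the rough box supplies only a coarse pose, so ICP starts close enough to the target for nearest-neighbour correspondences to be mostly correct, which is exactly the role the initial alignment plays in the pipeline.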