
適合自動駕駛車輛之結合邊緣資訊即時影像語意分割系統

Real-Time Semantic Segmentation with Edge Information for Autonomous Vehicles

Advisor: Li-Chen Fu (傅立成)
Co-advisor: Pei-Yung Hsiao (蕭培墉)

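Abstract


Advanced driver assistance systems (ADAS) have two basic functional requirements. The first is object detection, which keeps a moving vehicle from colliding with obstacles or pedestrians on the road. The second is finding the drivable area through image segmentation. Unlike traditional image segmentation methods, deep learning architectures for semantic segmentation can identify irregular road regions more accurately and guide self-driving vehicles through more complex road environments. With the spread of convolutional neural networks (CNNs) in recent years, their performance has surpassed that of traditional segmentation methods built on hand-crafted features. However, CNN architectures are complex and demand more processing time and hardware performance, so difficulties remain for real-time use on in-vehicle processing systems. Some proposed methods, such as ENet, run faster by removing some convolutional layers but sacrifice segmentation accuracy. This research first analyzes the outputs of state-of-the-art real-time semantic segmentation systems. These outputs show that most misclassified pixels lie on the boundary between two adjacent objects. Based on this observation, this research proposes a novel real-time semantic segmentation network that includes a class-aware edge loss module and a channel attention mechanism, aiming to improve accuracy without harming running speed. The proposed method is evaluated on the Cityscapes dataset, currently regarded as the most challenging and authoritative road semantic segmentation dataset. The evaluation shows that, under real-time operating conditions, the mean accuracy of the proposed method exceeds 70%.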

Parallel Abstract


Advanced Driver Assistance Systems (ADAS) provide two basic functions. One is object detection, which prevents vehicles from hitting pedestrians or other obstacles. The other is image segmentation, which recognizes drivable areas and guides the vehicle forward. For the latter, semantic segmentation based on deep learning architectures handles road areas better than traditional image segmentation methods, allowing a vehicle to drive in more complex environments. With the growing popularity of Convolutional Neural Networks (CNNs) in recent years, traditional methods based on hand-crafted features have been outperformed. However, deep CNN models are difficult to deploy in vehicle applications because their complex processing is costly in time. Although some proposed methods, such as the Efficient Neural Network (ENet), achieve higher speed by removing some layers, they do so at the cost of segmentation accuracy. In this work, we first analyze the output of state-of-the-art real-time semantic segmentation networks. The results show that most of the misclassified pixels are located on the edge between two classes. Based on this observation, we propose a novel semantic segmentation network that contains a class-aware edge loss module and a channel-wise attention mechanism, aiming to improve accuracy without reducing inference speed. We evaluate the proposed method on the Cityscapes dataset, which is the most challenging and authoritative on-road semantic segmentation dataset. The results show that the proposed method achieves over 70% mean IoU on the Cityscapes test set under real-time requirements.
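
The two quantities this abstract relies on, the share of misclassified pixels that fall on class boundaries and the mean IoU, can be illustrated with a small pure-Python sketch. This is not the thesis implementation: the function names, the 4-neighbourhood definition of a boundary pixel, and the toy label maps are all assumptions made for illustration.

```python
def boundary_mask(labels):
    """Mark pixels whose 4-neighbourhood contains a different class label.

    Assumption: a "boundary" pixel is one with at least one differing
    up/down/left/right neighbour; the thesis may define edges differently.
    """
    h, w = len(labels), len(labels[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = True
                    break
    return mask


def edge_error_share(truth, pred):
    """Fraction of misclassified pixels lying on a ground-truth class boundary."""
    mask = boundary_mask(truth)
    errors = 0
    on_edge = 0
    for y, row in enumerate(truth):
        for x, t in enumerate(row):
            if pred[y][x] != t:
                errors += 1
                if mask[y][x]:
                    on_edge += 1
    return on_edge / errors if errors else 0.0


def mean_iou(truth, pred, num_classes):
    """Mean intersection-over-union over classes that occur in truth or pred."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for t_row, p_row in zip(truth, pred):
        for t, p in zip(t_row, p_row):
            if t == p:
                inter[t] += 1   # true positive for class t
                union[t] += 1
            else:
                union[t] += 1   # false negative for class t
                union[p] += 1   # false positive for class p
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 4x4 label maps: class 0 on the left, class 1 on the right.
truth = [[0, 0, 1, 1] for _ in range(4)]
pred = [[0, 0, 0, 1] for _ in range(4)]  # third column wrongly predicted as 0

print(edge_error_share(truth, pred))
print(mean_iou(truth, pred, num_classes=2))
```

In this toy case every error sits next to the class boundary, which mirrors the kind of observation that motivates an edge-focused loss term.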

