
深層凸型非負矩陣拆解結合深度卷積神經網路之道路障礙物偵測

Deep Convex-Nonnegative Matrix Factorization Integrated with Deep Convolutional Networks for On-Road Obstacle Detection

Advisor: 傅立成 (Li-Chen Fu)
Co-advisor: 蕭培墉 (Pei-Yung Hsiao)

Abstract


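Deaths and injuries from traffic accidents have remained persistently high in recent years. Meanwhile, the development of intelligent vehicles is steadily accelerating, spanning a broad range of topics that includes positioning systems, energy-saving assistance, collision avoidance, and even autonomous driving. Collision avoidance is a major focus of current safety systems and an essential technology for autonomous driving systems (Advanced Driver Assistance System, ADAS). Collision avoidance systems rely on various sensors to perceive the environment; among these, image capture units that recognize road conditions and traffic signals are an important source of information. Image-based obstacle detection has been under development for some time, but traditional methods have reached a performance ceiling. To overcome this limitation and achieve more accurate detection, deep learning, which has become popular over the past two years and learns more effectively from large amounts of data, has been introduced to extract richer information from image capture systems.

This thesis applies convolutional neural networks to images in order to bring deep learning to image-based detection. Exploiting the ability of convolutional neural networks to learn statistical regularities from large amounts of data, we learn individual feature descriptions for the four detection classes we care about most: pedestrians, cars, cyclists, and motorcyclists. In addition, using the Deep Convex-Nonnegative Matrix Factorization proposed in this thesis, we train multilayer basis and coefficient matrices for the different object classes to improve the detection performance of the convolutional neural network.

To validate our method, we conduct experiments on the KITTI dataset, which likewise consists of road driving scenes, and on the well-known INRIA pedestrian dataset, achieving 79% and 91% AP, respectively. We also recorded our own road driving scenes on campus and in urban areas, on which the method achieves recall/precision above 95%.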

Parallel Abstract


As the number of road accidents continues to rise each year, developing advanced driver assistance systems (ADAS) is becoming critical. An ADAS applies advanced computer technologies to alert drivers at the appropriate time and so minimize the likelihood of accidents. Its most essential function is to detect, from captured camera images, on-road obstacles that may endanger the host vehicle, especially those in front of it. In this thesis, we propose a novel deep learning framework that incorporates our proposed Deep Convex-Nonnegative Matrix Factorization (DC-NMF) technique to process camera images for obstacle detection. DC-NMF learns more sophisticated bases that represent the original high-dimensional features. We first use DC-NMF to extract multilayer basis matrices and then use them to improve the detection performance of the deep learning framework, which we call Deep Convex-NMF Net (DC-NMF Net). To validate the proposed work, we evaluate the average precision (AP) of our method on the KITTI and INRIA datasets, obtaining 79% and 91%, respectively. We also built our own urban-scene dataset, on which the method achieves 95% recall/precision.
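The abstract does not give the DC-NMF update rules, so the following is only a minimal sketch of the general idea of extracting multilayer basis matrices. It swaps in plain NMF with the standard Lee-Seung multiplicative updates, stacked greedily layer by layer; it is not the convex variant proposed in the thesis. The function names (`nmf`, `multilayer_nmf`, `reconstruct`), the ranks, and the random input matrix are all assumptions made for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF, X ~= W @ H, using Lee-Seung multiplicative updates.

    X is a nonnegative (n_features, n_samples) matrix. This is a stand-in
    for the thesis's convex variant, whose update rules are not given.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def multilayer_nmf(X, ranks, n_iter=200):
    """Greedy stacking: factorize X, then factorize the resulting
    coefficient matrix again, once per entry in `ranks` (hypothetical)."""
    bases = []
    H = X
    for layer, rank in enumerate(ranks):
        W, H = nmf(H, rank, n_iter=n_iter, seed=layer)
        bases.append(W)
    return bases, H

def reconstruct(bases, H):
    """Multiply the layers back together: W1 @ W2 @ ... @ H."""
    out = H
    for W in reversed(bases):
        out = W @ out
    return out

# Hypothetical data: 100 samples of 64-dimensional nonnegative features.
rng = np.random.default_rng(42)
X = rng.random((64, 100))
bases, H = multilayer_nmf(X, ranks=[16, 8])
X_hat = reconstruct(bases, H)
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
```

In a detection pipeline such as the one described, the columns of `X` would presumably be feature vectors for one object class and the learned `bases` would be trained per class, but how the thesis feeds them into the convolutional network is not stated in the abstract.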

