  • Degree thesis

改進YOLO模型應用於自動駕駛車道路目標檢測

Improving YOLO Models for Road Object Detection in Autonomous Driving

Advisor: 陳永隆
Co-advisor: 黃馨逸 (Hsin-I Huang)

Abstract


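Autonomous driving technology has attracted considerable attention in recent years, and real-time object detection is a key component of its perception system. A central challenge in developing safe and efficient autonomous vehicles is detecting objects accurately and in real time across complex traffic environments. Because vehicles move at high speed and their surroundings change in complex ways, objects at different scales must be detected, which places high demands on the performance of the network model. In addition, driving devices differ in their performance capabilities, so lightweight models are needed to ensure stable operation on devices with limited computing resources. YOLOv8 has shown strong performance in object detection. However, the algorithm needs improvement to handle the challenges that object detection systems face in complex traffic, such as diverse object classes, small-scale objects, fast-moving objects, blur, glare, and low light, especially at night. To address these problems, we propose two methods. The first is YOLOv8s with Spatial Pyramid Pooling Fast Cross-Stage Partial Channel Model (YOLOv8s-SPPFCSPC), which adds the Spatial Pyramid Pooling Cross-Stage Partial Channel (SPPCSPC) method to the Spatial Pyramid Pooling-Fast (SPPF) method of the original YOLOv8 architecture so that it can address the extraction of repeated features and strengthen the network's generalization ability. The second is YOLOv8s with Spatial Pyramid Pooling Fast Cross-Stage Partial Channel and Squeeze-and-Excitation Model (YOLOv8s-SPPFCSPC SE), in which we add the Squeeze-and-Excitation (SE) attention mechanism to help the model capture spatial correlations and to improve feature extraction through channel recalibration. We expect that fusing SPPFCSPC with the SE attention mechanism will make the model lighter and better performing, with precision and recall expected to exceed those of the YOLOv8 model.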
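For readers unfamiliar with SPPF, the pooling idea it builds on can be sketched as follows. This is a minimal single-channel illustration in pure Python, not the thesis's SPPFCSPC module: the function names `max_pool2d` and `sppf_pool` are hypothetical, the 5x5 kernel is an assumption taken from common SPPF implementations, and the 1x1 convolutions that normally surround the pooling stage are omitted. It shows that chaining a small stride-1 max pool reproduces the larger receptive fields of parallel pooling branches.

```python
import math

def max_pool2d(fmap, k):
    """Stride-1 max pooling with 'same' padding on a 2D list.

    Out-of-bounds positions are ignored, which is equivalent to padding
    with negative infinity.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            best = -math.inf
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        best = max(best, fmap[y][x])
            row.append(best)
        out.append(row)
    return out

def sppf_pool(fmap, k=5):
    """Return the four maps an SPPF-style block would concatenate.

    The same k x k pool is applied three times in sequence, so each
    later map sees a progressively larger neighbourhood.
    """
    p1 = max_pool2d(fmap, k)
    p2 = max_pool2d(p1, k)
    p3 = max_pool2d(p2, k)
    return [fmap, p1, p2, p3]

# Small deterministic feature map used only for illustration.
fmap = [[(i * 7 + j * 13) % 17 for j in range(8)] for i in range(8)]
branches = sppf_pool(fmap)
```

With a 5x5 kernel, the second and third chained outputs match single 9x9 and 13x13 max pools, which is why the sequential form can replace parallel pooling branches at lower cost.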

Parallel Abstract


Object detection in autonomous driving has garnered significant attention in recent years, and real-time object detection is a critical component of the perception system. In developing safe and efficient self-driving cars, one of the primary challenges is detecting objects accurately and in real time in diverse and complex traffic environments. Because vehicles move at high speed and the surrounding environment changes in complex ways, objects of varying scales must be detected, which places high demands on the performance of network models. Additionally, driving devices differ in their performance capabilities, so lightweight models are needed to ensure stable operation on devices with limited computing resources. YOLOv8 demonstrates robust performance in object detection. However, the algorithm requires improvement to handle the challenges of object detection in complex traffic scenarios, such as diverse object classes, small-scale objects, fast-moving objects, blur, glare, and low-light conditions, especially at night. To address these issues, we propose two methods. The first, YOLOv8 with Spatial Pyramid Pooling Fast Cross-Stage Partial Channel (YOLOv8-SPPFCSPC), incorporates the Spatial Pyramid Pooling Cross-Stage Partial Channel (SPPCSPC) method into the Spatial Pyramid Pooling-Fast (SPPF) method of the original YOLOv8 architecture, with the aim of reducing the extraction of redundant features and enhancing the network's generalization ability. The second, YOLOv8 with Spatial Pyramid Pooling Fast Cross-Stage Partial Channel and Squeeze-and-Excitation (YOLOv8-SPPFCSPC-SE), integrates the Squeeze-and-Excitation (SE) attention mechanism to help the model capture spatial correlations and to enhance feature extraction through channel recalibration. We anticipate that combining the SPPFCSPC method with the SE attention mechanism will make the model lighter and improve its performance, with precision and recall expected to exceed those of the YOLOv8 model.
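The channel recalibration attributed to the SE mechanism can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the function name `se_recalibrate`, the toy weights, and the omission of bias terms are all assumptions made for illustration. The sketch follows the usual squeeze (global average pooling), excitation (two small fully connected layers), and scale steps.

```python
import math

def se_recalibrate(x, w1, w2):
    """Apply squeeze-and-excitation style channel gating.

    x  : list of C feature maps, each an H x W nested list
    w1 : (C // r) x C weights of the bottleneck layer (hypothetical values)
    w2 : C x (C // r) weights of the expansion layer (hypothetical values)
    Bias terms are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expansion layer with sigmoid, one gate in (0, 1) per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(x, gates)]
    return scaled, gates

# Two 2x2 channels and a reduction ratio of 2, chosen only for illustration.
x = [[[2.0, 2.0], [2.0, 2.0]],
     [[1.0, 1.0], [1.0, 1.0]]]
w1 = [[1.0, -1.0]]
w2 = [[1.0], [-1.0]]
scaled, gates = se_recalibrate(x, w1, w2)
```

In this toy setting the first channel receives a gate above 0.5 and the second a gate below 0.5, so the block amplifies one channel relative to the other while leaving the spatial layout of each map unchanged.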

