  • Thesis / Dissertation

Underwater Object Detection Using CLAHE Preprocessing and Improved YOLOv5

Advisor: 許輝煌

Abstract


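Deep learning is an important branch of machine learning. It refers to neural network architectures modeled on the neurons of the human brain, and it has advanced rapidly as data and computing resources have grown. Object detection is one of its most important applications. Object detection means recognizing and localizing target objects in images or video: a good detector must not only determine which objects are present but also mark their positions precisely. It matters because it underpins many applications, such as video surveillance, image retrieval, autonomous driving, and robot manipulation. Early object detection relied on traditional image processing and machine learning techniques such as the Hough Transform and Support Vector Machines.
These traditional methods can detect targets successfully, but their detection ability and speed drop markedly on complex scenes and large-scale data. With the development of deep learning, Convolutional Neural Networks (CNNs), as described by O'Shea, K. et al., showed powerful feature extraction ability in image processing, and object detection has progressed substantially since then. Many researchers incorporated CNNs into object detection models, and their strong performance gave rise to a series of deep-learning-based detection algorithms such as R-CNN, Fast R-CNN, and YOLO (You Only Look Once).
These algorithms learn complex feature representations from images automatically and achieve higher detection accuracy and speed than traditional methods. Among them, the YOLO series offers real-time detection and is widely used across many fields.
Underwater object detection has been a major challenge in object detection in recent years. Underwater images often suffer from blur, color distortion, and low contrast, and underwater objects are usually small, so detection performance tends to be poor. This thesis designs a network model based on CLAHE preprocessing and an improved YOLOv5s. First, in the image preprocessing stage, the CLAHE algorithm enhances image contrast, mitigating low contrast and color distortion. Next, a coordinate attention (CA) module is added to the YOLOv5s backbone to strengthen the learning of features of the objects of interest. Finally, an adaptive spatial feature fusion (ASFF) module is added to the prediction layer to strengthen the model's perception of objects of interest at different scales, which helps improve overall detection performance and the detection metrics. For the experiments, this thesis uses precision, recall, mean average precision at an IoU threshold of 0.5 (mAP@0.5), and mean average precision over IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95) as evaluation metrics, compares the model with other underwater detection models, and runs ablation experiments to assess each module individually. The best results are 85.1%, 87.6%, 90.1%, and 66.9%, respectively. The comparison with other underwater detection models shows that the model is effective, and the ablation results show that the added modules can handle the blur, color distortion, and low contrast encountered in underwater object detection and improve its accuracy.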
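As an illustration of the contrast-limiting idea behind the CLAHE preprocessing step, the following Python sketch builds the equalization lookup table for a single image tile: it clips the tile's histogram, redistributes the clipped counts, and maps gray levels through the resulting cumulative distribution. It is a simplified sketch under stated assumptions, not the thesis's implementation: the function names, the `clip_limit` default, and the redistribution scheme are hypothetical, and full CLAHE additionally divides the image into a grid of tiles and interpolates between neighboring tile mappings, which is omitted here.

```python
def clipped_equalization_map(tile, clip_limit=2.0, levels=256):
    """Build a contrast-limited equalization lookup table for one tile.

    tile: non-empty list of rows of integer gray levels in [0, levels).
    clip_limit: hypothetical parameter; the maximum bin height expressed
        as a multiple of the height a perfectly uniform histogram would have.
    """
    pixels = [p for row in tile for p in row]
    n = len(pixels)

    # Histogram of gray levels in the tile.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Clip every bin at the ceiling and collect the excess counts.
    ceiling = max(1, int(clip_limit * n / levels))
    excess = 0
    for i in range(levels):
        if hist[i] > ceiling:
            excess += hist[i] - ceiling
            hist[i] = ceiling

    # Redistribute the excess: an equal share to every bin, with the
    # remainder spread at evenly spaced bins (an assumed scheme; bins are
    # not re-clipped afterwards).
    share, remainder = divmod(excess, levels)
    for i in range(levels):
        hist[i] += share
    for k in range(remainder):
        hist[(k * levels) // remainder] += 1

    # The cumulative distribution of the clipped histogram is the mapping.
    lut, running = [], 0
    for count in hist:
        running += count
        lut.append(round((levels - 1) * running / n))
    return lut


def equalize_tile(tile, clip_limit=2.0, levels=256):
    """Apply the contrast-limited mapping to every pixel of the tile."""
    lut = clipped_equalization_map(tile, clip_limit, levels)
    return [[lut[p] for p in row] for row in tile]


# A low-contrast 8x8 tile whose gray levels only span 100..103.
tile = [[100 + (r * 8 + c) % 4 for c in range(8)] for r in range(8)]
enhanced = equalize_tile(tile, clip_limit=2.0)
print(min(map(min, enhanced)), max(map(max, enhanced)))
```

A larger `clip_limit` allows stronger contrast stretching, while a smaller one keeps the mapping closer to the identity and limits noise amplification. In practice a library routine such as OpenCV's `cv2.createCLAHE` would normally be used; the abstract does not state which implementation or parameters the thesis relies on.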

Parallel Abstract


Deep learning plays a very important role in machine learning. It refers to neural network architectures constructed by simulating the neurons of the human brain. In an era of rapidly increasing data and computing resources, deep learning has made rapid progress, and object detection is one of its most important applications. Object detection means identifying and locating target objects in an image or video. A good object detection model not only determines which objects are in the image but also accurately marks their positions. Object detection is important because it can be applied effectively in many areas, such as video surveillance, image retrieval, autonomous driving, and robot operation. Early object detection relied on traditional image processing and machine learning techniques, such as the Hough Transform and Support Vector Machines. Although these traditional methods can detect targets successfully, their detection ability and speed decrease significantly on complex scenes and large-scale data. With the development of deep learning, Convolutional Neural Networks (CNNs), as described by O'Shea, K. et al., demonstrated powerful feature extraction capabilities in image processing, and object detection has made significant progress since then. Many researchers began to incorporate CNNs into object detection models, and their excellent performance led to a series of deep-learning-based object detection algorithms, such as R-CNN, Fast R-CNN, and YOLO (You Only Look Once). These algorithms automatically learn complex feature representations in images and achieve higher detection accuracy and speed than traditional methods. Among them, the YOLO series offers real-time detection and is widely used in various fields.
Underwater object detection has become a major challenge in object detection in recent years. Underwater images often suffer from blurring, color distortion, and low contrast. In addition, underwater objects are usually small, which can lead to poor detection. This thesis designs a network model based on CLAHE preprocessing and an improved YOLOv5s. First, the image preprocessing stage uses the CLAHE algorithm to enhance image contrast, which mitigates low contrast and color distortion. Next, a coordinate attention (CA) module is added to the backbone of YOLOv5s to enhance the learning of object features. Finally, an adaptive spatial feature fusion (ASFF) module is added to the prediction layer to enhance the model's perception of objects of interest at different scales, which helps to improve overall detection performance and the detection metrics. In the experiments, this thesis adopts precision, recall, mean average precision at an IoU threshold of 0.5 (mAP@0.5), and mean average precision over IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95) as evaluation metrics, compares the model with other underwater detection models, and conducts ablation experiments to evaluate each module individually. The best results were 85.1%, 87.6%, 90.1%, and 66.9%, respectively. The comparison with other underwater detection models shows that the model is effective. The ablation results show that the added modules can handle the image blur, color distortion, and low contrast encountered in underwater object detection and improve detection accuracy.
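To make the evaluation metrics concrete, the following Python sketch computes precision, recall, and average precision at a single IoU threshold for one class on one image. It is a minimal illustration under stated assumptions, not the thesis's evaluation code: the function names, the box format `(x1, y1, x2, y2)`, the greedy matching rule, and the all-point interpolation of the precision-recall curve are all assumptions, and detection toolkits may differ in these details. mAP@0.5 would average such AP values over classes at a threshold of 0.5, and mAP@0.5:0.95 would further average over thresholds from 0.5 to 0.95.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(predictions, ground_truth, iou_threshold=0.5):
    """Return (precision, recall, average_precision).

    predictions: list of (confidence, box) pairs for one class.
    ground_truth: list of boxes for the same class.
    """
    matched = set()
    tp_flags = []
    # Visit predictions from most to least confident; each ground-truth
    # box may be matched by at most one prediction.
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp_flags.append(True)
        else:
            tp_flags.append(False)

    n_gt = len(ground_truth)
    tp = sum(tp_flags)
    precision = tp / len(tp_flags) if tp_flags else 0.0
    recall = tp / n_gt if n_gt else 0.0

    # Precision and recall after each ranked prediction.
    precisions, recalls, running_tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        running_tp += is_tp
        precisions.append(running_tp / rank)
        recalls.append(running_tp / n_gt if n_gt else 0.0)

    # Make precision non-increasing in recall, then sum the area under it.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return precision, recall, ap


# Hypothetical boxes: two objects, three detections (one false positive).
gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (21, 21, 30, 30))]
print(evaluate(preds, gts))
```

In this toy case both objects are found, so recall is 1.0, while the one spurious detection lowers precision to 2/3; the AP value additionally reflects that the false positive is ranked above one of the correct detections.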

References


[1] Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A Fast Learning Algorithm for Deep Belief Nets. Neural Computation, 18(7), 1527–1554. https://doi.org/10.1162/neco.2006.18.7.1527
[2] Redmon, J., Divvala, S. K., Girshick, R. B., & Farhadi, A. (2015). You Only Look Once: Unified, Real-Time Object Detection. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 779–788.
[3] Jocher, G. (2020). YOLOv5 by Ultralytics (Version 7.0) [Computer software]. https://doi.org/10.5281/zenodo.3908559 https://github.com/ultralytics/YOLOv5
[4] Pizer, S. M., Johnston, R. E., Ericksen, J. P., Yankaskas, B. C., & Muller, K. E. (1990). Contrast-limited adaptive histogram equalization: speed and effectiveness. Proceedings of the First Conference on Visualization in Biomedical Computing, Atlanta, GA, USA, 337–345. https://doi.org/10.1109/VBC.1990.109340
[5] Hou, Q., Zhou, D., & Feng, J. (2021). Coordinate Attention for Efficient Mobile Network Design. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 13708–13717.
