  • Thesis


Real-time object detection with applications to underwater recreational activities

Advisor: 廖文宏


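This thesis applies color correction to underwater images captured with ordinary cameras, so that models trained on existing image datasets can be used to detect objects in underwater images. Building on deep learning, the study uses the pix2pix network, a type of generative adversarial network, and analyzes and evaluates parameter settings by varying the loss function, the number of iterations, and the data grouping, correcting underwater images so that they resemble images taken above water. In addition, the detection model is trained with transfer learning, and the AP of each object class and the overall mAP are analyzed to meet the requirement of real-time underwater object detection. After evaluating and testing different models and parameter settings, the best result was an AP of 0.71 for Fish, 0.72 for Jellyfish, and 0.39 for Diver, with an overall mAP of 0.606; compared with no image correction under the same conditions, mAP rose by 50.3%. It is hoped that this color correction and detection system will let people engaged in underwater recreation, even with their field of view limited by a mask, use a mounted underwater camera to quickly detect the position and information of objects of interest inside and outside that field of view, making better use of the limited time available underwater.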


This thesis uses color correction to restore underwater images so that object detection models trained on existing image datasets can handle underwater images without extensive retraining. Building on deep learning, the study uses the pix2pix network, a variant of the generative adversarial network (GAN), to correct the color of underwater images. We analyze and evaluate the quality of the restoration under different combinations of loss function, number of iterations, and data grouping. The object detection model is trained with transfer learning, and the per-class average precision (AP) and overall mean average precision (mAP) are analyzed against the requirements of real-time detection during underwater activities. In the best configuration, the AP is 0.71 for Fish, 0.72 for Jellyfish, and 0.39 for Diver, with an overall mAP of 0.606, a 50.3% improvement in mAP over the same setup without color correction. We expect the system to let users quickly obtain the position and information of objects of interest even though their field of view is limited by the mask, thereby enhancing the experience of underwater activities.
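As a quick check on how the reported figures relate, the sketch below computes mAP as the unweighted mean of the three per-class AP values quoted in the abstract. The function name `mean_average_precision` and the dictionary layout are illustrative assumptions, not the thesis's evaluation code. The abstract does not say how AP was computed (for example, which IoU threshold was used), so only the final averaging step is shown.

```python
from statistics import mean

# Per-class average precision (AP) values as quoted in the abstract.
reported_ap = {"Fish": 0.71, "Jellyfish": 0.72, "Diver": 0.39}


def mean_average_precision(ap_by_class):
    """Return the unweighted mean of per-class AP values (hypothetical helper)."""
    if not ap_by_class:
        raise ValueError("at least one class is required")
    return mean(list(ap_by_class.values()))


map_from_rounded = mean_average_precision(reported_ap)

# The APs in the abstract are rounded to two decimals, so this value can
# differ slightly from the reported overall mAP of 0.606.
print(f"mAP from rounded per-class APs: {map_from_rounded:.4f}")
```

The mean of the rounded values is about 0.6067, which agrees with the reported 0.606 to within the rounding of the per-class figures. The 50.3% improvement is not reproduced here because the abstract does not give the uncorrected baseline mAP.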


