Applications of Advanced Deep Learning Algorithms to Medical, Biological, and General Image Processing

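With rapid advances in hardware and the arrival of the big-data era, image processing has changed dramatically over the past five years thanks to breakthroughs in deep learning. Across image processing fields, including image denoising, image deblurring, image detection and recognition, image pre-processing, and image post-processing, papers proposing advanced improvements are appearing at a rapid pace. Although deep learning has become a popular research direction in computer vision and image processing, there is still room for improvement. Deep learning can be viewed as a form of learning from data, and in general we cannot expect the training data to be flawless, so imperfect data leads to inaccurate model results. Traditional machine learning methods, by contrast, retain human intuition about the structural characteristics of objects in an image. On this principle, we can make full use of the local features in an image when developing image processing algorithms. This thesis presents three independent improvements in different fields: image segmentation, object detection, and object recognition. For image segmentation, we address the segmentation of ocular blood vessels under noise. Most segmentation work targets everyday objects with relatively clear "image semantics". Optical coherence tomography angiography (OCTA) images, however, show very fine and blurred retinal vessels, and there is neither an effective pre-trained model nor a large dataset to help physicians cope with scarce data and interference in fine-vessel images. We therefore collaborated with Chang Gung Memorial Hospital to build a new database and proposed a method that combines the strengths of deep learning and machine learning so that our results can reach practical application. For object detection, we address the tracking of small objects in large scenes. Among the many object detection methods proposed, the best known are one-stage detection (the YOLO system) and two-stage detection (Faster R-CNN). These methods achieve efficient results on clean objects with distinct features, but their effectiveness is limited on images with small objects and background noise. We therefore collaborated with a cetacean laboratory to propose a systematic solution to the practical problem of staff having to sort, segment, and identify large numbers of long-range photographs. Finally, for object recognition, we studied how to use multiple classifiers to overcome more efficiently the limitations of a traditional single CNN on recognition datasets. A CNN architecture is a very good classifier for recognition, but relying on a single model limits the outcome to choosing the best model by accuracy and loss. Our proposed method improves the results of the original system very efficiently. All three problems are unavoidable in image processing. For applications in real-world scenes, this thesis achieves more efficient and accurate results by improving the main model architecture. The thesis discusses the shortcomings of data-driven learning and of trained models, and how to remedy them by incorporating traditional machine learning methods, so that the results align better with human vision and the algorithms improve.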

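Keywords

object detection; semantic segmentation; neural networks; computer vision; conservation; denoising; optical coherence tomography angiography; retinal microvasculature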

Parallel Abstract

Driven by upgraded hardware and the rise of big data, image processing algorithms have improved significantly over the past five years through the application of deep learning. In every image processing field, such as noise reduction, image deblurring, pre-processing, and post-processing, deep learning is being explored in full swing. Although deep learning is an inevitable trend in computer vision and image processing, there is still room for improvement, especially in exactness and accuracy. Deep learning is essentially data-driven learning and depends heavily on the completeness of the database. However, a comprehensive database is usually unattainable for real-life images, because human labeling consumes considerable time and resources. In contrast, traditional handcrafted methods, which rely on analyzing and observing the statistics of object information, retain image features that correspond to human experience. Therefore, this thesis focuses on extracting the local patterns of objects and develops novel algorithms that compensate for the weaknesses of deep learning in three areas: segmentation, detection, and classification. The first is segmentation. Our main contributions are handling noise interference and highlighting object features. Most related work on semantic segmentation depends on a useful pre-trained model. However, our optical coherence tomography angiography (OCTA) medical images contain complicated topologies of small, blurred vessels in a limited dataset, and no applicable pre-trained model exists. Therefore, in this work, we collaborated with Chang Gung Memorial Hospital (CGMH) to develop a new database and combined the advantages of traditional machine learning and deep learning to support real-life clinical application.
The second direction is detection. This work proposes a systematic solution for detecting small objects in a massive scene. Although there are many mature, state-of-the-art algorithms such as Faster RCNN and YOLO, they fall short on the super-resolution problems posed by remotely shot images, including small targets, camera shake, and intense light pollution. A system that can extract discriminative detection and classification features from remotely shot images of wild scenes would be highly valuable. Therefore, we collaborated with the Lab of Cetacean at NTU, which has collected and labeled long-range photographs of wild dolphins for more than ten years and can provide a large and valuable database. Last but not least, for classification, we investigated and proposed an ensemble algorithm for facial emotion recognition that boosts the performance of a traditional CNN by combining several classification models, including how to choose the best model among many functions, losses, and objective criteria. These contributions are significant and pioneering because they address general and unavoidable image processing problems. Our results have been adopted by other teams, such as doctors and biologists. The proposed data-driven deep learning architecture effectively integrates the merits of various image processing and machine learning methods.

Parallel Keywords

object detection; semantic segmentation; convolutional neural networks; computer vision; conservation; noise reduction; optical coherence tomography; retinal vasculature

