Single Image Noise Removal Based on Unsupervised Cross-Domain Deep Learning

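Digital multimedia data have become ubiquitous in daily life, with images and video making up the bulk of it; for example, countless images are produced at every moment by mobile devices of all kinds and by ubiquitous roadside surveillance cameras. This massive volume of image information can enable a large number of everyday applications. However, image sources are highly diverse and their quality is hard to control. Poor image quality can severely degrade the performance of related applications or render the images useless altogether. Image quality restoration and enhancement has therefore become an important research topic. With the rapid progress of deep learning in recent years, many deep-network-based image restoration techniques have appeared. Most current architectures, however, rely on end-to-end supervised learning with synthetic training datasets. The main problem is that a network trained on synthetic data does not necessarily suit real-world image degradation, while datasets that pair real low-quality images with their high-quality counterparts are hard to obtain. Cross-domain deep learning has therefore recently been studied as a way to address the potential gap between domains. This thesis studies cross-domain deep learning for image quality restoration and attempts to address potential problems of current methods, for example: (1) limited generalization: existing methods may be hard to apply to different kinds of images; (2) domain shift: in unsupervised learning without paired training data, domain shift may arise because good image feature representations are hard to learn and because the noise in low-quality images varies too widely; and (3) unclear domain boundaries: when the noise in the training images varies too widely, the image content is too complex, and no paired training data are available, the boundary between the low-quality and high-quality image domains becomes unclear, which makes good cross-domain learning hard to achieve. To address these problems with practical applications in mind, this thesis proposes an image denoising network architecture based on cross-domain unsupervised deep learning. Our goal is to learn an image feature representation from the input noisy image dataset and bring it close to the feature representation of clean images, in order to achieve better image quality restoration. We propose using a bidirectional generative adversarial network to translate the unpaired training images in both directions (noisy to clean and clean to noisy), and using multiple loss functions in the image spatial and frequency domains to train a deep network for image denoising (noise removal). In the experiments, we trained and tested the proposed deep learning model on several well-known image datasets (CBSD68, SIDD, and the NIH-, AAPM- and Mayo Clinic-sponsored Low Dose CT Grand Challenge). The experimental results confirm that the proposed method outperforms traditional non-deep-learning methods and representative recent deep-learning methods, and that it is suitable for practical problems.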

Keywords

Image denoising; unsupervised network; deep learning; generative adversarial network

Parallel Abstract

Digital multimedia data, especially images and videos, have become ubiquitous in daily life. For example, huge amounts of image data are captured by mobile devices and surveillance cameras. These data can enable many kinds of applications, such as face/object detection and recognition, event detection, security monitoring, environment mining, autonomous driving, medical diagnosis, industrial inspection, and social media mining. However, image sources are highly diverse and their quality is hard to control, so low-quality images may significantly degrade the performance of the related applications. Image quality restoration has therefore become a popular and important research topic. With the recent rapid development of deep learning techniques, several deep learning-based image restoration frameworks have been presented. Most of them are end-to-end supervised deep networks trained on synthesized paired image datasets, which may not fit real-world problems. Because real paired training data are hard to collect, cross-domain deep learning has recently been investigated to bridge the domain gap, and it has also been applied to unsupervised image restoration.
In this thesis, we investigate cross-domain deep learning-based image restoration for single image denoising and address the problems remaining in most current cross-domain learning frameworks: (i) limited generalization: a learned deep model may not generalize well to different types of images; (ii) domain shift: an unsupervised domain-adaptation model may not extract sufficiently strong feature representations because of the high diversity of noise in the input image data, which may cause domain shift; and (iii) unclear domain boundary: highly diverse noise and complicated image content may blur the domain boundaries between unpaired image inputs, resulting in poor image reconstruction performance. To solve these problems, this thesis presents a novel cross-domain unsupervised deep learning network for single image noise removal. Our goal is to learn an invariant feature representation from input noisy images that is expected to align with the representation of clean images for better image restoration. More specifically, we propose a generative adversarial network (GAN)-based architecture with different types of discriminators and loss functions in both the image spatial and frequency domains. In our framework, we aim to learn two image generators, one translating noisy images to clean images and the other translating clean images to noisy images, based on unpaired training images. Extensive experimental results on several well-known image datasets, namely CBSD68, SIDD, and the NIH-, AAPM- and Mayo Clinic-sponsored Low Dose CT Grand Challenge, verify that the proposed deep learning model for image denoising outperforms traditional non-deep-learning-based methods and state-of-the-art deep learning-based methods both quantitatively and qualitatively.
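The abstract does not spell out the individual loss terms, so the following is only a minimal NumPy sketch of how a spatial-domain term and a frequency-domain term might be combined in a cycle-consistency check between the two generators. The function names, the L1 form of both terms, the `freq_weight` value, and the toy stand-in generators are all assumptions made for illustration; they are not taken from the thesis, and the adversarial (discriminator) terms are left out entirely.

```python
import numpy as np


def spatial_l1(a, b):
    """Spatial-domain term: mean absolute pixel difference (assumed form)."""
    return float(np.mean(np.abs(a - b)))


def frequency_l1(a, b):
    """Frequency-domain term: mean absolute difference between the
    2-D FFT magnitude spectra of the two images (assumed form)."""
    mag_a = np.abs(np.fft.fft2(a, axes=(0, 1)))
    mag_b = np.abs(np.fft.fft2(b, axes=(0, 1)))
    return float(np.mean(np.abs(mag_a - mag_b)))


def cycle_loss(x, g_forward, g_backward, freq_weight=0.1):
    """Cycle consistency: x -> g_forward -> g_backward should reproduce x.
    freq_weight is a hypothetical balancing factor."""
    x_cycled = g_backward(g_forward(x))
    return spatial_l1(x, x_cycled) + freq_weight * frequency_l1(x, x_cycled)


# Toy stand-ins for the two generators (hypothetical, not real networks):
# "denoising" removes a constant offset, "noising" adds it back.
def toy_noisy_to_clean(img):
    return img - 0.05


def toy_clean_to_noisy(img):
    return img + 0.05


rng = np.random.default_rng(0)
noisy = rng.random((32, 32))

# Noisy -> clean -> noisy cycle; close to zero for these inverse toy maps.
loss_value = cycle_loss(noisy, toy_noisy_to_clean, toy_clean_to_noisy)
print(loss_value)
```

In an actual training setup the two generators would be neural networks, the same check would be run in the clean-to-noisy-to-clean direction as well, and these terms would be added to the adversarial losses from the discriminators.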

Parallel Keywords

Image Denoising; Unsupervised Learning; Deep Learning; Generative Adversarial Network

References

[1] Zhang, K., Zuo, W., Chen, Y., Meng, D., & Zhang, L. (2017). Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7), 3142-3155.

[2] Moen, T. R., Chen, B., Holmes III, D. R., Duan, X., Yu, Z., Yu, L., ... & McCollough, C. H. (2021). Low‐dose CT image and projection dataset. Medical physics, 48(2), 902-911.

[3] Kim, N., Jang, D., Lee, S., Kim, B., & Kim, D. S. (2021). Unsupervised Image Denoising with Frequency Domain Knowledge. arXiv preprint arXiv:2111.14362.

[4] Ahn, N., Kang, B., & Sohn, K. A. (2018). Fast, accurate, and lightweight super-resolution with cascading residual network. In Proceedings of the European conference on computer vision (ECCV) (pp. 252-268).

[5] Kim, Y., Soh, J. W., Park, G. Y., & Cho, N. I. (2020). Transfer learning from synthetic to real-noise denoising with adaptive instance normalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 3482-3492).
