  • Degree thesis

基於加權算法與機器學習之影像超解析技術

Super-Resolution Based on Advanced Weighting and Learning Techniques

Advisor: 丁建均 (Jian-Jiun Ding)

Abstract


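Digital images are now easy to acquire, and high-resolution images are often needed for subsequent image processing and analysis. However, the spatial resolution of images captured by digital cameras is limited by the image sensor and by optics. Upgrading the hardware to obtain higher-resolution images is too costly, whereas image super-resolution offers a convenient and economical alternative.

Image super-resolution aims to produce a high-resolution image from a single low-resolution image. It is a fundamental problem in image processing and is commonly used in high-level computer vision applications such as surveillance systems, medical diagnosis, and remote sensing. Because many different high-resolution images can correspond to the same low-resolution image, image super-resolution is an ill-posed problem with no unique and stable solution. In this thesis, we propose two image super-resolution methods: a model that combines the strengths of different super-resolution methods, and a method based on deep learning.

Conventional image super-resolution methods such as bilinear interpolation and cubic convolution interpolation are simple and fast, but they produce distortions such as blurring and ringing. To address these problems, we propose a model that combines the strengths of different methods. We analyzed three image super-resolution methods and found that different methods suit image regions with different characteristics. We therefore extract several kinds of features from the image, collect statistics relating these feature values to the error of each method's result, and, for each input image, compute a weighted average of the methods' outputs to obtain the final result.

With the recent development of convolutional neural networks and deep learning, models trained on large amounts of data achieve strong results in many computer vision applications. In this thesis, we propose a second model, based on deep learning. Because an image has different characteristics in different frequency bands, we first use the wavelet transform to split the image into four frequency bands and feed them into four separate models, so that each model learns the features of a specific band, which improves training effectiveness. In addition, we adopt dense connections so that features from different layers of the network are used more effectively. At test time, we use geometric self-ensemble to boost the model's performance.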
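
The weighted-average combination described in this abstract can be sketched roughly as follows. The abstract does not state the weighting rule, so this sketch assumes weights inversely proportional to each method's expected error. The names `inverse_error_weights` and `fuse`, the candidate images, and the error values are all hypothetical, and plain lists of rows stand in for image arrays.

```python
def inverse_error_weights(expected_errors, eps=1e-8):
    """Map per-method expected errors to weights that sum to 1.

    Assumption (not from the thesis): a smaller expected error
    earns a proportionally larger weight.
    """
    inverse = [1.0 / (err + eps) for err in expected_errors]
    total = sum(inverse)
    return [value / total for value in inverse]


def fuse(candidates, weights):
    """Pixel-wise weighted average of equally sized images (lists of rows)."""
    height, width = len(candidates[0]), len(candidates[0][0])
    return [
        [sum(w * img[r][c] for w, img in zip(weights, candidates))
         for c in range(width)]
        for r in range(height)
    ]


# Hypothetical outputs of three super-resolution methods for one input image.
candidates = [
    [[10.0, 20.0], [30.0, 40.0]],
    [[12.0, 18.0], [33.0, 39.0]],
    [[14.0, 22.0], [27.0, 44.0]],
]
# Hypothetical expected errors, for example looked up from training
# statistics for the features measured on this input image.
weights = inverse_error_weights([1.0, 2.0, 4.0])
fused = fuse(candidates, weights)
```

In this sketch the method with the lowest expected error dominates the result, while the other two still contribute; any other monotone mapping from error to weight would fit the same structure.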

Parallel Abstract


Nowadays, digital images are easy to acquire, and high-resolution images are often required for subsequent image processing and analysis. However, the spatial resolution of images captured by digital cameras is limited by the principles of optics and the size of imaging sensors. While constructing optical components that can capture very high-resolution images is prohibitively expensive and impractical, image super-resolution (SR) provides a convenient and economical solution.

Image super-resolution aims to generate a high-resolution (HR) image from a low-resolution (LR) input image. It is an essential task in image processing and is used in many high-level computer vision applications, such as video surveillance, medical diagnosis, and remote sensing. Super-resolution is an ill-posed problem because multiple HR images can correspond to the same LR image. In this thesis, we propose two algorithms for image super-resolution. The first combines different image super-resolution methods to take advantage of their respective strengths, while the second is based on deep learning.

Conventional image super-resolution methods, including bilinear interpolation and cubic convolution interpolation, are intuitive and simple to use. However, they often suffer from artifacts such as blurring and ringing. To deal with this problem, we propose a weighting-based algorithm that takes advantage of three different image super-resolution methods and generates the final result from a combination of their outputs. We extract features of the input LR image and investigate the performance of the chosen methods under different features. Results from the candidate methods are combined using a weighted average based on statistics gathered from the training data.

With the development of convolutional neural networks and deep learning in recent years, models trained on large-scale datasets achieve favorable performance in many computer vision applications.
In this thesis, we propose a second approach to image super-resolution, based on deep learning. We use the wavelet transform to separate the input image into four frequency bands and train a model for each sub-band. By processing information from different frequency bands with different CNN models, we can extract features more efficiently and learn better LR-to-HR mappings. In addition, we add dense connections to the model to make better use of its internal features. Furthermore, geometric self-ensemble is applied in the testing stage to further improve performance.
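
Two of the steps named above, the four-band wavelet split and geometric self-ensemble, can be illustrated with a small sketch. The abstract does not say which wavelet is used, so a one-level Haar transform is assumed here. `nearest_2x` is a hypothetical stand-in for a trained model, not the CNN from the thesis, and all function names are illustrative. Images are plain lists of rows.

```python
def haar_split(img):
    """One-level 2-D Haar transform of an image with even height and width.

    Returns four half-size sub-bands: low-pass, column-difference,
    row-difference, and diagonal detail.
    """
    bands = ([], [], [], [])
    for r in range(0, len(img), 2):
        rows = ([], [], [], [])
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 2)  # low-pass: scaled local sum
            rows[1].append((a - b + d - e) / 2)  # change across columns
            rows[2].append((a + b - d - e) / 2)  # change across rows
            rows[3].append((a - b - d + e) / 2)  # diagonal change
        for band, row in zip(bands, rows):
            band.append(row)
    return bands


def haar_merge(low, col_diff, row_diff, diag):
    """Invert haar_split and rebuild the full-size image."""
    img = []
    for r in range(len(low)):
        top, bottom = [], []
        for c in range(len(low[0])):
            s, x, y, z = low[r][c], col_diff[r][c], row_diff[r][c], diag[r][c]
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        img += [top, bottom]
    return img


def rot90(img):
    """Rotate an image by 90 degrees counterclockwise."""
    return [list(col) for col in zip(*img)][::-1]


def flip(img):
    """Mirror an image left to right."""
    return [row[::-1] for row in img]


def self_ensemble(model, lr):
    """Average the model output over 8 flip/rotation variants of the input."""
    outputs = []
    for mirrored in (False, True):
        for k in range(4):
            x = flip(lr) if mirrored else lr
            for _ in range(k):
                x = rot90(x)
            y = model(x)
            for _ in range((4 - k) % 4):  # undo the rotation
                y = rot90(y)
            if mirrored:
                y = flip(y)  # undo the mirror
            outputs.append(y)
    height, width = len(outputs[0]), len(outputs[0][0])
    return [[sum(o[r][c] for o in outputs) / len(outputs)
             for c in range(width)] for r in range(height)]


def nearest_2x(img):
    """Hypothetical stand-in model: 2x nearest-neighbour upscaling."""
    return [[v for v in row for _ in range(2)] for row in img for _ in range(2)]
```

In a pipeline like the one described, each sub-band returned by `haar_split` would go to its own model and the four predictions would be recombined with `haar_merge`. The stand-in model is unaffected by flips and rotations, so `self_ensemble` returns the same image as a single pass; a trained network generally is not, which is why averaging the eight variants can help.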

