Lossless Image Compression Techniques Using Learning and Prediction Methods

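With the rapid development of machine learning in classification and prediction, we also see its potential for image compression. The three most important steps in lossless image compression are prediction, prediction residual correction, and entropy coding; in this thesis we improve the prediction and entropy coding steps. First, we introduce diagonal gradients to turn the conventional gradient adjusted prediction (GAP) model into diagonal gradient adjusted prediction, and we turn linear prediction into dynamic window linear prediction by adjusting the prediction window dynamically. The proposed dynamic window linear prediction also incorporates a similarity matrix, which adjusts each pixel's weight in the prediction according to the similarity between pixels. Next, we propose a deep neural network model that uses neighboring-pixel information to combine different prediction techniques and thereby obtain more accurate predictions. Furthermore, under context adaptive arithmetic coding, the compression efficiency of entropy coding is positively related to the context model used. We therefore propose a robust context model estimator that combines local geometric characteristics with prediction residuals, and, for the context models built by this estimator, we propose techniques for updating the model frequency table that converge faster to the probability distribution each model should have. Finally, we propose an entropy coding pipeline that dynamically decides which model the arithmetic coder should use according to the distribution of prediction residuals. This pipeline also abandons the conventional model frequency table and instead directly adjusts a hyper-Laplacian function to approximate the probability distribution of each model, which yields faster convergence and therefore better compression efficiency.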
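For orientation, the conventional GAP predictor that the abstract takes as its starting point can be sketched as follows. This is a minimal sketch of GAP as it is commonly described for CALIC, not the diagonal variant proposed in the thesis. The function name `gap_predict`, the list-of-rows image layout, and the thresholds 80, 32, and 8 (the values usually quoted for 8-bit images) are assumptions made for illustration.

```python
def gap_predict(img, r, c):
    """Conventional GAP prediction for pixel (r, c) of a grayscale image.

    img is a list of rows; requires r >= 2 and 2 <= c < width - 1 so that
    every causal neighbor used below exists (border handling is omitted).
    """
    W, WW = img[r][c - 1], img[r][c - 2]
    N, NN = img[r - 1][c], img[r - 2][c]
    NW, NE = img[r - 1][c - 1], img[r - 1][c + 1]
    NNE = img[r - 2][c + 1]

    # Local gradient estimates in the horizontal and vertical directions.
    dh = abs(W - WW) + abs(N - NW) + abs(N - NE)
    dv = abs(W - NW) + abs(N - NN) + abs(NE - NNE)
    diff = dv - dh

    if diff > 80:       # sharp horizontal edge: follow the left neighbor
        return float(W)
    if diff < -80:      # sharp vertical edge: follow the upper neighbor
        return float(N)

    pred = (W + N) / 2 + (NE - NW) / 4
    if diff > 32:       # horizontal edge
        pred = (pred + W) / 2
    elif diff > 8:      # weak horizontal edge
        pred = (3 * pred + W) / 4
    elif diff < -32:    # vertical edge
        pred = (pred + N) / 2
    elif diff < -8:     # weak vertical edge
        pred = (3 * pred + N) / 4
    return pred
```

A diagonal variant would presumably add gradient estimates along the two diagonal directions to `dh` and `dv`, but the abstract does not give its exact form, so it is not sketched here.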

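Keywords

Lossless image compression; deep neural network; gradient adjusted prediction; diagonal gradient adjusted prediction; linear prediction; dynamic window linear prediction; model frequency table; hyper-Laplacian function; entropy coding; context adaptive arithmetic coding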

Parallel Abstract

With the growing popularity of machine learning in classification and prediction, we see its potential for image compression. The three main stages of lossless image compression are prediction, prediction residual correction, and entropy coding. In this work, we focus on improving the prediction and entropy coding stages. First, we improve traditional prediction techniques such as gradient adjusted prediction (GAP) and weighted linear prediction by modifying them into diagonal GAP, which adds diagonal gradient information, and dynamic window weighted linear prediction, which adaptively adjusts the prediction window size according to local information. The proposed dynamic window weighted linear prediction also includes a weighting matrix that places different emphasis on pixels according to their similarity to the current pixel. Second, we propose a deep neural network model that uses neighboring pixels to adaptively combine different prediction techniques into a more accurate prediction. Moreover, when context arithmetic coding with a frequency table is applied, the coding efficiency of the entropy coder is positively related to the context chosen. We therefore propose a robust context model estimator that uses local geometric and prediction residue information, together with efficient techniques for updating the context model frequency table so that it approximates the distribution of each context at a faster rate. Finally, we propose a distribution-dependent model for context generation in context adaptive arithmetic coding, which takes the histogram of the prediction residue into account when generating contexts. In addition, we propose a novel approach to context arithmetic coding that dispenses with the traditional frequency table and instead adjusts a hyper-Laplacian distribution to model the context probability distribution, resulting in faster convergence of the probability distribution as well as better overall entropy coding efficiency.
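To make the idea of replacing a frequency table with a parametric distribution concrete, the sketch below builds a discretized hyper-Laplacian probability table over integer prediction residuals and measures the ideal code length an arithmetic coder would approach with it. The parameterization p(e) ∝ exp(-(|e|/scale)^alpha), the names `hyper_laplacian_pmf` and `code_length_bits`, and the residual range of ±255 are assumptions for illustration; the abstract does not specify how the thesis parameterizes or updates the distribution.

```python
import math


def hyper_laplacian_pmf(scale, alpha, max_abs=255):
    """Discretized hyper-Laplacian pmf over integer residuals in [-max_abs, max_abs].

    Assumed form: p(e) proportional to exp(-(|e| / scale) ** alpha).
    alpha = 1 gives a Laplacian shape; alpha < 1 gives heavier tails.
    """
    support = range(-max_abs, max_abs + 1)
    weights = [math.exp(-((abs(e) / scale) ** alpha)) for e in support]
    total = sum(weights)
    return {e: w / total for e, w in zip(support, weights)}


def code_length_bits(residuals, pmf):
    """Ideal code length in bits, i.e. the sum of -log2 p(e) over the residuals."""
    return sum(-math.log2(pmf[e]) for e in residuals)


# Example: small residuals cost fewer bits under a narrow model than a wide one.
residuals = [0, 1, -1, 0, 2]
narrow = hyper_laplacian_pmf(scale=1.5, alpha=0.8)
wide = hyper_laplacian_pmf(scale=20.0, alpha=0.8)
print(code_length_bits(residuals, narrow), code_length_bits(residuals, wide))
```

Only two parameters per context need to be adapted here, rather than one count per symbol, which is one plausible reason such a model could converge faster than a frequency table. A practical coder would also need to keep every probability strictly positive, since very small `scale` values combined with large `alpha` can underflow to zero.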

Parallel Keywords

Lossless image compression; deep neural network; gradient adjusted prediction; diagonal gradient adjusted prediction; weighted linear prediction; dynamic window weighted linear prediction; frequency table; hyper-Laplacian; entropy coding; context adaptive arithmetic coding
