Speech Enhancement Based on Compressive Sensing

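Speech enhancement has long been studied by many researchers, yet no satisfactory method has emerged that handles noise of widely differing characteristics. Noise is unavoidable because of unsuitable recording environments or imperfect recording devices, and a noisy signal degrades subsequent speech processing, so effective speech enhancement is important. Speech enhancement can be viewed as an estimation problem: accurately estimating the speech signal from a noisy signal. We assume that speech follows some statistical model and that the noise is a random variable uncorrelated with the speech; this property, together with an error criterion, can be used to obtain the enhanced speech. However, which statistical model speech follows and which error criterion to use remain open questions. Over the past decade, compressive sensing, a new method for signal sampling and reconstruction, has been proposed, offering a new way to estimate signals. This thesis therefore investigates how to incorporate compressive sensing into speech enhancement. We first transform the signal into the time-frequency domain and assume that the resulting time-frequency representation can be mapped to a transform domain in which it is sparse. We then process the noisy time-frequency representation using a missing data imputation technique together with compressive sensing. Our experimental results show that the method performs well under many types of noise and is also suitable for noise that conventional methods cannot handle. Finally, we examine further how the method is particularly effective against noise with certain characteristics.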
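The processing steps described above (mark unreliable time-frequency cells, fit a sparse representation to the reliable ones, then fill in the rest) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the random dictionary `D`, the synthetic feature vector, the known reliability mask, the sparsity level `k`, and the use of orthogonal matching pursuit as the sparse solver (the thesis may use a different solver, dictionary, and mask estimator).

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedy k-sparse solution of A @ a ~= y."""
    norms = np.linalg.norm(A, axis=0)
    residual = y.copy()
    support, coef = [], np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        corr = np.abs(A.T @ residual) / norms
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms by least squares, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    a = np.zeros(A.shape[1])
    a[support] = coef
    return a

def impute(D, x_noisy, reliable, k):
    """Keep reliable entries; replace unreliable ones with the sparse estimate."""
    # Only the rows of D that correspond to reliable entries are "measured".
    a = omp(D[reliable], x_noisy[reliable], k)
    x_hat = x_noisy.copy()
    x_hat[~reliable] = (D @ a)[~reliable]
    return x_hat

# Synthetic stand-in for one time-frequency feature vector (hypothetical data).
rng = np.random.default_rng(0)
n, m, k = 128, 256, 3                      # feature size, atoms, sparsity
D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)             # unit-norm dictionary atoms
a_true = np.zeros(m)
idx = rng.choice(m, size=k, replace=False)
a_true[idx] = rng.uniform(1.0, 2.0, k) * rng.choice([-1.0, 1.0], k)
x_clean = D @ a_true

# Assume the mask is known: 32 of 128 entries are dominated by noise.
reliable = np.ones(n, dtype=bool)
reliable[rng.choice(n, size=32, replace=False)] = False
x_noisy = x_clean.copy()
x_noisy[~reliable] += rng.uniform(1.0, 3.0, 32)

x_hat = impute(D, x_noisy, reliable, k)
rel_err = np.linalg.norm(x_hat - x_clean) / np.linalg.norm(x_clean)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this idealized setting the reliable entries are noise-free and the mask is given, so the sketch only shows the imputation step; estimating the mask from a real noisy recording is a separate problem that the sketch does not address.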

Keywords

missing data imputation; speech enhancement; compressive sensing; noise removal; missing data mask

Parallel Abstract

Speech enhancement is an active research problem that many researchers have worked to address. However, there is still no satisfactory method that can deal with different kinds of noise. Noise is inevitable because of unsuitable recording environments or imperfect recording devices, and it degrades subsequent speech processing. Speech enhancement is therefore an important topic. It can be regarded as an estimation problem in which the clean speech is estimated from noisy measurements. Assuming that the speech signal satisfies some statistical model and that the noise is an uncorrelated random process, the enhanced signal can be estimated according to some distortion measure. However, which speech model and which distortion measure should be used remain open questions. Over the past decade, compressive sensing, a new signal acquisition and reconstruction method, has been proposed. Compressive sensing offers a new perspective on signal estimation. Hence, in this thesis, we explore how to perform speech enhancement by applying compressive sensing. Our experimental results show that the proposed method performs well under various noise types. Moreover, it is considerably better at handling noise that traditional methods cannot address well.

Parallel Keywords

missing data imputation; compressive sensing (CS); speech enhancement; noise removal; missing data mask
