Improving the Noise Reduction Performance of the Wiener Filter Framework with Non-negative Matrix Factorization

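Speech is the most direct way for people to convey information in daily life, and one of their most important sources of information. However, speech is easily corrupted by noise, which in turn degrades quality of life. Over the past few decades, many noise reduction algorithms have therefore been proposed to suppress background noise and improve speech quality. The most widely used noise reduction algorithms today are unsupervised; successful examples include the Wiener filter, logMMSE, and KLT. Previous studies indicate that unsupervised noise reduction performs very well under stationary noise (for example, steady low-frequency noise and pink noise), but that many challenges remain for non-stationary noise (for example, human-voice noise). In recent years, many researchers have turned to supervised noise reduction to overcome the shortcomings of unsupervised methods; a successful example is the deep denoising autoencoder (DDAE). When enough training data are available, DDAE achieves better noise reduction than unsupervised methods. However, when large amounts of training data are hard to obtain, its applicability is limited. This study therefore proposes a new noise reduction algorithm, called Adaptive Wiener-NMF (AWNMF), to address the shortcomings of the unsupervised and supervised methods described above, namely (1) poor performance under non-stationary noise and (2) the need for a large amount of training data. Several objective evaluation measures (PESQ, SSNRI, SDI) show that, across a variety of noise environments (for example, a baby crying and a siren), the proposed AWNMF algorithm achieves better noise reduction than commonly used methods (for example, logMMSE, KLT, Wiener, and NMF-based methods). Moreover, when very few training utterances are available, AWNMF achieves better noise reduction than DDAE. In summary, AWNMF is a novel and effective noise reduction method.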

Keywords

Speech enhancement; noise tracking; noise reduction

Parallel Abstract

Speech is one of the most direct ways for humans to communicate. However, spoken messages are easily corrupted by noise, sometimes to the extent that quality of life is affected. Over the past few decades, a variety of noise reduction algorithms have been developed to suppress background noise and improve speech quality. Unsupervised algorithms, such as the Wiener filter, logMMSE, and KLT, are among the most widely used and successful noise reduction (NR) techniques. Numerous studies report that these unsupervised algorithms perform very well in stationary noise environments, e.g. low-frequency noise and pink noise. Even so, many challenges remain for unsupervised noise reduction under non-stationary noise conditions, such as human-voice noise. Recently, many researchers have adopted supervised noise reduction techniques to handle non-stationary noise and to overcome the disadvantages of unsupervised methods. With a sufficiently large amount of training data, the deep denoising autoencoder (DDAE) method has been shown to perform well on noise reduction. However, when training data are scarce, its applicability is limited. In this work, we propose a new noise reduction algorithm, called Adaptive Wiener-NMF (AWNMF), to address the problems of both unsupervised and supervised noise reduction methods: poor performance under non-stationary noise and the need for a large amount of training data. Objective evaluation measures (PESQ, SSNRI, SDI) show that AWNMF outperforms common noise reduction methods, and that it outperforms the DDAE method when training data are scarce. In conclusion, we have developed an innovative and effective noise reduction method.

Parallel Keywords

Speech Enhancement; Noise Tracking; Noise Reduction
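
The abstract names the Wiener filter and NMF as the two building blocks of AWNMF but does not describe how they are combined. Purely as an illustration of the general idea, the following is a minimal sketch of a generic NMF-based Wiener-type gain, not the AWNMF algorithm itself. Everything in it is an assumption made for the example: the function names `nmf` and `wiener_nmf_gain`, the Euclidean multiplicative updates, the ranks, and the random non-negative matrices that stand in for magnitude spectrograms.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Factor V ~= W @ H with multiplicative updates (Euclidean cost).

    If W is supplied, it is kept fixed and only H is updated.
    """
    rng = np.random.default_rng(seed)
    fixed_W = W is not None
    if not fixed_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def wiener_nmf_gain(V_noisy, W_speech, W_noise, n_iter=200):
    """Wiener-type gain built from NMF estimates of speech and noise."""
    # Stack the pre-trained bases and fit only the activations to the noisy input.
    W = np.hstack([W_speech, W_noise])
    _, H = nmf(V_noisy, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]   # estimated speech magnitude
    N = W_noise @ H[k:]    # estimated noise magnitude
    return S**2 / (S**2 + N**2 + EPS)


# Synthetic stand-ins for magnitude spectrograms (frequency bins x frames).
rng = np.random.default_rng(1)
clean = rng.random((64, 4)) @ rng.random((4, 50))
noise = rng.random((64, 2)) @ rng.random((2, 50))

W_s, _ = nmf(clean, rank=4)   # speech basis learned from clean data
W_n, _ = nmf(noise, rank=2)   # noise basis learned from noise-only data

noisy = clean + noise
G = wiener_nmf_gain(noisy, W_s, W_n)
enhanced = G * noisy          # gain applied to the noisy magnitudes
```

In this sketch the gain lies between 0 and 1 for every time-frequency bin, so it can only attenuate the noisy magnitudes. With real audio, the matrices would come from a short-time Fourier transform, and the enhanced magnitudes would be recombined with the noisy phase before resynthesis.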
