Speech Enhancement Based on Spectral Restoration Techniques

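Background noise degrades the quality of speech signals. The main goal of a speech enhancement system is to reduce the effect of background noise on the speech signal while keeping speech distortion in the enhanced signal low. A speech enhancement system can also serve as a preprocessor for speech signal processing systems such as speech recognition and speech coding. Speech enhancement methods fall into several families, including filtering techniques, spectral restoration techniques, and model-based techniques.

This thesis applies five speech enhancement methods based on spectral restoration together with three noise tracking methods. The enhancement methods are minimum mean-square error (MMSE), minimum mean-square error log-spectral amplitude (MMSE-LSA), maximum a posteriori spectrum (MAP), maximum-likelihood spectral amplitude (MLSA), and maximum-likelihood spectral power (MLSP). The noise tracking methods are minimum statistics (MS), minima controlled recursive averaging (MCRA), and improved minima controlled recursive averaging (IMCRA). Each enhancement method is paired with each noise tracking method, and the results are compared with the Wiener filter, a filtering-based enhancement method. The experiments show that the spectral restoration methods yield greater improvement than the Wiener filter, and that the MMSE method paired with MCRA noise tracking gives a clearly noticeable improvement. When MMSE with MCRA is applied to speech recognition, both word accuracy and sentence correct rate improve on overall average.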
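As a rough illustration of how the gain-based methods named above differ, the sketch below computes textbook closed-form gains for the Wiener filter, MLSA, and MLSP from the a posteriori SNR of one frequency bin. The function names, the `floor` parameter, and the use of max(gamma - 1, 0) as the a priori SNR estimate are assumptions made for this example, not the thesis's implementation. MMSE and MMSE-LSA involve Bessel functions and the exponential integral, so they are left out, along with MAP, to keep the sketch short and dependency-free.

```python
import math

def mlsp_gain(gamma):
    """ML spectral power gain (power subtraction): sqrt((gamma - 1) / gamma)."""
    return math.sqrt(max(gamma - 1.0, 0.0) / gamma)

def mlsa_gain(gamma):
    """ML spectral amplitude gain: average of unity and the MLSP gain."""
    return 0.5 + 0.5 * mlsp_gain(gamma)

def wiener_gain(gamma):
    """Wiener gain xi / (1 + xi), with xi estimated as max(gamma - 1, 0)."""
    xi = max(gamma - 1.0, 0.0)  # assumed a priori SNR estimate
    return xi / (1.0 + xi)

# Hypothetical lookup table used only by this example.
GAINS = {"wiener": wiener_gain, "mlsa": mlsa_gain, "mlsp": mlsp_gain}

def enhance_frame(noisy_mag, noise_psd, method="wiener", floor=1e-12):
    """Scale each bin's noisy magnitude by the chosen gain.

    noisy_mag: noisy spectral magnitudes for one frame (one value per bin).
    noise_psd: noise power estimates for the same bins, e.g. from a noise tracker.
    floor: small positive constant that guards against division by zero.
    """
    gain = GAINS[method]
    out = []
    for mag, psd in zip(noisy_mag, noise_psd):
        # A posteriori SNR: noisy power divided by estimated noise power.
        gamma = max(mag * mag / max(psd, floor), floor)
        out.append(gain(gamma) * mag)
    return out
```

For a bin with a posteriori SNR 4, these formulas give gains of 0.75 (Wiener), about 0.87 (MLSP), and about 0.93 (MLSA), so of the three the Wiener gain attenuates the most.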

Keywords

Speech Enhancement; Noise Tracking; Noise Reduction; Spectral Restoration

Parallel Abstract

Background noise corrupts speech signals and degrades speech quality. The aim of speech enhancement is to reduce the background noise in a noisy speech signal while keeping speech distortion as low as possible. A speech enhancement system can also serve as a preprocessor for speech processing systems such as speech recognizers and speech coders. Speech enhancement methods fall into three categories: filtering techniques, spectral restoration techniques, and speech model techniques. This thesis investigates five speech enhancement methods based on spectral restoration: minimum mean-square error (MMSE), minimum mean-square error log-spectral amplitude (MMSE-LSA), maximum a posteriori spectrum (MAP), maximum-likelihood spectral amplitude (MLSA), and maximum-likelihood spectral power (MLSP). The Wiener filter (WF) is included for comparison. Each method is combined with three well-known noise tracking algorithms, minimum statistics (MS), minima controlled recursive averaging (MCRA), and improved minima controlled recursive averaging (IMCRA), to recover clean speech. The experimental results show that all five spectral restoration techniques outperform the Wiener filter. Among them, the MMSE method combined with MCRA achieves the largest improvement. When this combination is applied as a front end to a speech recognizer, both the word accuracy rate and the sentence correct rate increase on average.
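The noise trackers named above supply the noise power estimate that a spectral gain rule needs. The following is a deliberately simplified sketch in the spirit of minimum statistics: it smooths the power of one frequency bin recursively and takes the minimum over a sliding window of recent frames. The function name and the `alpha` and `window` defaults are illustrative assumptions. The actual MS algorithm adds bias compensation and a time-varying smoothing factor, and MCRA and IMCRA additionally weight the update by an estimated speech presence probability, none of which is modeled here.

```python
from collections import deque

def track_noise_min(power_frames, alpha=0.85, window=8):
    """Return a per-frame noise power estimate for a single frequency bin.

    power_frames: noisy power values of one bin, one value per frame.
    alpha: recursive smoothing factor (assumed value, closer to 1 = smoother).
    window: number of recent smoothed frames searched for the minimum.
    """
    smoothed = None
    recent = deque(maxlen=window)  # keeps only the last `window` smoothed values
    estimates = []
    for p in power_frames:
        # First-order recursive smoothing of the noisy power.
        smoothed = p if smoothed is None else alpha * smoothed + (1.0 - alpha) * p
        recent.append(smoothed)
        # Speech raises the power only briefly, so the windowed minimum
        # is taken as the noise level.
        estimates.append(min(recent))
    return estimates
```

With a short burst of high power between quiet frames, the estimate stays at the quiet level as long as the window still contains a pre-burst frame.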

Parallel Keywords

Speech Enhancement; Noise Tracking; Noise Reduction; Spectral Restoration

