A Speech Enhancement System Based on Combining Statistical Estimators of Speech and Noise

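In real-world automatic speech processing systems, speech signals are easily corrupted by environmental noise, which lowers recognition accuracy. Speech enhancement is therefore commonly applied to reduce noise interference and improve system performance. This thesis proposes a speech enhancement system that removes background noise from speech signals. The method combines estimators of the speech and noise magnitude spectra to suppress noise components further. The a priori signal-to-noise ratio (SNR) is estimated with the two-step noise reduction (TSNR) algorithm, which mitigates the drawbacks of the conventional decision-directed algorithm. Some noise still remains after this front-end suppression. To address it, a postfilter at the back end of the system adjusts the gain function and removes residual noise in non-speech intervals, so that the enhanced speech reaches better quality. In the experiments, two objective measures, segmental SNR and the Perceptual Evaluation of Speech Quality (PESQ), are used to assess enhancement quality. The experimental results show that the proposed method improves noise suppression and also yields a clear improvement in perceived quality.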
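To make the two-step idea concrete, the following is a minimal sketch of decision-directed a priori SNR estimation followed by a TSNR refinement, written for per-bin power spectra. It is not the thesis implementation: the Wiener gain xi / (1 + xi) is swapped in for the combined speech and noise estimator described above, the noise power is assumed known and stationary, and the function names, the smoothing factor `beta = 0.98`, and the choice to feed the refined estimate back into the next frame are all illustrative assumptions.

```python
def wiener_gain(xi):
    """Wiener gain for an a priori SNR value xi (assumed gain rule)."""
    return xi / (1.0 + xi)


def tsnr_gains(noisy_power, noise_power, beta=0.98):
    """Sketch of two-step noise reduction on power spectra.

    noisy_power: list of frames, each a list of |X(k)|^2 per frequency bin.
    noise_power: list of noise power estimates per bin (assumed known).
    Returns a list of frames holding the refined gain for each bin.
    """
    n_bins = len(noise_power)
    prev_clean_power = [0.0] * n_bins  # |S_hat(k)|^2 from the previous frame
    gains = []
    for frame in noisy_power:
        frame_gains = []
        for k in range(n_bins):
            gamma = frame[k] / noise_power[k]  # a posteriori SNR
            # Step 1: decision-directed estimate of the a priori SNR.
            xi_dd = (beta * prev_clean_power[k] / noise_power[k]
                     + (1.0 - beta) * max(gamma - 1.0, 0.0))
            g_dd = wiener_gain(xi_dd)
            # Step 2: re-estimate the a priori SNR from the current frame
            # after applying the step-1 gain, which removes the one-frame lag.
            xi_tsnr = (g_dd ** 2) * gamma
            g_tsnr = wiener_gain(xi_tsnr)
            # Assumption: the refined estimate drives the next frame's step 1.
            prev_clean_power[k] = (g_tsnr ** 2) * frame[k]
            frame_gains.append(g_tsnr)
        gains.append(frame_gains)
    return gains
```

Multiplying each noisy spectral component by its gain gives the enhanced spectrum. A postfilter such as the one the abstract mentions would then adjust these gains further in non-speech intervals; that stage is not sketched here.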

Keywords

speech enhancement; two-step noise reduction; signal-to-noise ratio

Parallel Abstract

The performance of an automatic speech processing system is often degraded by noise embedded in the processed speech signal. Speech enhancement is therefore applied to such systems to reduce noise interference and improve system performance. In this thesis, we propose a speech enhancement system that reduces background noise by combining spectral magnitude estimators for both speech and noise. The a priori signal-to-noise ratio (SNR) is refined by two-step noise reduction (TSNR) to remove the drawbacks of the decision-directed approach. However, residual noise still remains in the enhanced speech. To solve this problem, we add a postfilter at the back end of the system to eliminate residual noise during speech pauses. Finally, we use two objective measures, the segmental SNR and the perceptual evaluation of speech quality (PESQ), to assess the quality of the enhanced speech. Experimental results show the effectiveness of the proposed speech enhancement system.
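As an illustration of the first objective measure, here is a minimal sketch of segmental SNR: the SNR in dB is computed frame by frame between the clean reference and the enhanced signal, then averaged. The frame length of 256 samples, the non-overlapping framing, and the clamping range of -10 to 35 dB are common conventions assumed for this example, not values taken from the thesis.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Average per-frame SNR in dB between a clean and an enhanced signal.

    clean, enhanced: sequences of time-aligned samples.
    Frames are non-overlapping; per-frame values are clamped to [lo_db, hi_db].
    """
    n = min(len(clean), len(enhanced))
    values = []
    for start in range(0, n - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        if signal_energy == 0.0:
            continue  # skip silent reference frames
        if error_energy == 0.0:
            snr_db = hi_db  # perfect reconstruction: use the upper clamp
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        values.append(min(max(snr_db, lo_db), hi_db))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)
```

A higher value means the enhanced signal is closer to the clean reference. PESQ, the second measure, is a standardized perceptual model and is not reproduced here.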

Parallel Keywords

speech enhancement; two-step noise reduction (TSNR); signal-to-noise ratio (SNR)

