  • Degree thesis

1200 bps MELP Coder with Codebook-Driven Noisy Speech Enhancement

Advisor: 簡福榮

Abstract


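Speech coders deliver good speech quality when operating in clean environments, but quality often degrades in the presence of background noise. To improve the quality of coded speech, the envelope of the short-time power spectrum must be reconstructed accurately. Most low-bit-rate speech coders use line spectrum pair (LSP) parameters to represent the short-time speech spectrum. In this thesis, we propose a vector quantizer that jointly estimates the LPC parameters of speech and noise from noise-corrupted speech. The estimates are obtained by searching the speech and noise codebooks for the combination of codevectors with maximum likelihood. The quantizer is applied to mixed excitation linear prediction (MELP) speech coders at 2400 bps and 1200 bps, and both coders are evaluated subjectively and objectively under various noise conditions. Experimental results show that, although the 1200 bps MELP coder with the proposed quantizer does not score as well as the 2400 bps MELP coder in objective evaluation, its subjective scores are very close to those of the 2400 bps MELP coder, and it outperforms the multistage vector quantizer used by MELP.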

Parallel Abstract


The MELP speech coder provides fair communication quality in a clean environment, but its speech quality degrades in a noisy background environment. Most low-bit-rate speech coders use line spectrum pair (LSP) parameters to represent the spectral envelope of speech, and accurate reconstruction of that envelope is necessary. In this thesis, we propose a codebook-driven noisy speech enhancement scheme for the joint estimation of the linear predictive coding (LPC) parameters of speech and noise from the noisy observation. Maximum-likelihood estimates of the speech and noise LPC parameters are obtained by searching for the combination of codevectors that maximizes the likelihood. Two other LSP encoding methods, the single-frame and superframe multistage vector quantizers (MSVQ) adopted by the standard 2400 bps and 1200 bps mixed excitation linear prediction (MELP) coders, are included for comparison. Experimental results show that the proposed 2400 bps MELP coder performs slightly better than the proposed 1200 bps MELP coder. In addition, the proposed 2400 bps and 1200 bps MELP coders with codebook-driven noisy speech enhancement are both superior to the standard 2400 bps and 1200 bps MELP coders.
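
The joint codebook search described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`lpc_envelope`, `fit_gains`, `joint_search`), the least-squares gain fit, and the Gaussian (Whittle-type) log-likelihood are all assumptions chosen for illustration. Each pair of speech and noise codevectors defines a model spectrum, and the pair with the highest likelihood against the noisy power spectrum is kept.

```python
import numpy as np

def lpc_envelope(a, n_freq):
    """Spectral shape 1/|A(e^jw)|^2 for LPC coefficients a = [1, a1, ..., ap]."""
    A = np.fft.rfft(a, 2 * n_freq)[:n_freq]
    return 1.0 / np.maximum(np.abs(A) ** 2, 1e-12)

def fit_gains(p_noisy, s_shape, n_shape):
    """Assumed gain estimator: least squares for p_noisy ~ gs*s + gn*n,
    clipped so both gains stay positive."""
    M = np.stack([s_shape, n_shape], axis=1)
    g, *_ = np.linalg.lstsq(M, p_noisy, rcond=None)
    return np.maximum(g, 1e-8)

def log_likelihood(p_noisy, p_model):
    """Assumed Gaussian log-likelihood of the observed spectrum, up to constants."""
    return -np.sum(np.log(p_model) + p_noisy / p_model)

def joint_search(p_noisy, speech_cb, noise_cb):
    """Exhaustive search over all (speech, noise) codevector pairs.
    Returns (best log-likelihood, speech index, noise index, gains)."""
    n_freq = len(p_noisy)
    best = (-np.inf, None, None, None)
    for i, a in enumerate(speech_cb):
        s = lpc_envelope(a, n_freq)
        for j, b in enumerate(noise_cb):
            n = lpc_envelope(b, n_freq)
            g = fit_gains(p_noisy, s, n)
            ll = log_likelihood(p_noisy, g[0] * s + g[1] * n)
            if ll > best[0]:
                best = (ll, i, j, g)
    return best
```

In this sketch the cost grows with the product of the two codebook sizes, so a practical coder would likely restrict the noise codebook or reuse the previous frame's noise estimate; that choice is outside what the abstract states.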

