
Short-Term Predictor Parameter Estimation Based on Maximum Likelihood and Its Application to a Speech Coder

Advisor: 簡福榮

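Abstract


To improve the quality of coded speech, the envelope of the short-term power spectrum must be reconstructed accurately. Most low-bit-rate speech coders use line spectrum pair (LSP) parameters to represent the short-term speech spectrum. Speech coders usually operate in environments with background noise. Because background noise severely degrades the extraction of speech parameters, a speech coder should be designed to maintain good performance under noisy conditions. In this thesis, we propose a vector quantizer that jointly estimates the LPC parameters of speech and noise from noise-corrupted speech. The estimates are obtained by searching the speech and noise codebooks for the combination of codevectors with maximum likelihood. The thesis also applies this quantizer to the mixed excitation linear prediction (MELP) speech coder. Both quantizers are evaluated subjectively and objectively under various types of noise. The experimental results show that the proposed quantizer performs better than the multistage vector quantizer used by MELP.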

Parallel Abstract


Accurate reconstruction of the envelope of the short-term power spectrum is necessary for good coded-speech quality. Most low-bit-rate speech coders use line spectrum pair (LSP) parameters to represent the spectral envelope of speech. Speech coders usually operate in noisy background environments. Because background noise severely degrades the extraction of speech parameters, the coder must be designed to maintain good performance under noisy conditions. In this thesis, we propose a vector quantizer for the joint estimation of the linear predictive coding (LPC) parameters of speech and noise from noisy observations. Maximum-likelihood estimates of the speech and noise LPC parameters are obtained by searching for the combination of codevectors that maximizes the likelihood. Another LSP encoding method, the multistage vector quantizer (MSVQ) adopted by the mixed excitation linear prediction (MELP) coder, is included for comparison. Results of subjective and objective tests show that the proposed vector quantizer performs better than the MSVQ under various types of noisy environments.
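The joint codebook search described above can be illustrated with a minimal sketch. The abstract does not give the likelihood function or the gain handling, so everything below is an assumption made for illustration: the noisy spectrum is modeled as a gain-weighted sum of two all-pole spectra, the likelihood is a Whittle-type approximation, the gains come from a small fixed grid, and the names `lpc_power_spectrum`, `neg_log_likelihood`, and `joint_ml_search` are hypothetical. It is not the thesis's implementation.

```python
import itertools
import numpy as np


def lpc_power_spectrum(a, n_fft=256):
    """Return 1/|A(e^{jw})|^2 on the rfft grid, A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    A = np.fft.rfft(np.concatenate(([1.0], a)), n_fft)
    return 1.0 / np.maximum(np.abs(A) ** 2, 1e-12)


def neg_log_likelihood(periodogram, model):
    """Whittle-type negative log-likelihood up to constants (assumed form)."""
    return float(np.sum(np.log(model) + periodogram / model))


def joint_ml_search(periodogram, speech_cb, noise_cb, gains):
    """Exhaustively search codevector pairs and gains for the best-fitting model."""
    n_fft = 2 * (len(periodogram) - 1)
    speech_specs = [lpc_power_spectrum(a, n_fft) for a in speech_cb]
    noise_specs = [lpc_power_spectrum(a, n_fft) for a in noise_cb]
    best_cost, best = np.inf, None
    for (i, s), (j, n) in itertools.product(enumerate(speech_specs),
                                            enumerate(noise_specs)):
        for gs, gn in itertools.product(gains, gains):
            # Assumed model: speech and noise spectra add in the power domain.
            cost = neg_log_likelihood(periodogram, gs * s + gn * n)
            if cost < best_cost:
                best_cost, best = cost, (i, j, gs, gn)
    return best


# Toy codebooks (hypothetical values): second-order LPC vectors.
speech_cb = [np.array([-0.9, 0.2]), np.array([-1.2, 0.8])]
noise_cb = [np.array([0.0, 0.0]), np.array([0.5, 0.0])]
gains = [0.1, 1.0]

# Synthetic "observation": exactly speech entry 1 plus weak white noise.
observed = (1.0 * lpc_power_spectrum(speech_cb[1])
            + 0.1 * lpc_power_spectrum(noise_cb[0]))
result = joint_ml_search(observed, speech_cb, noise_cb, gains)
print(result)
```

In this toy setup the observation is built from one codebook combination, so the search should recover that combination. With real noisy speech the periodogram would be computed from a windowed frame, and an exhaustive search over full-size codebooks would likely need pruning.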

