  • Thesis

Robust Mandarin-English Mixed-Lingual Speech Recognition Algorithm Based on Hidden Conditional Random Field Acoustic Models

Mixed-Lingual Acoustic Modeling of Hidden Conditional Random Field for Robust Speech Recognition

Advisor: 洪維廷


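This thesis proposes training Mandarin/English acoustic models based on hidden conditional random fields (HCRF) with a robust training algorithm (abbreviated REST in the thesis). The aim is to address (1) noise robustness in mixed-lingual speech recognition and (2) cross-lingual recognition errors in mixed-lingual speech recognition. The REST algorithm improves the recognition performance of the HCRF models in noisy environments. The HCRF models are then trained with a discriminative criterion, which improves their ability to discriminate between speech models and substantially reduces cross-lingual recognition errors. A series of experiments shows that the average error rate of the HCRF-based speech models is about 16.41% lower than that of the traditional HMM (Rover_2 noise), and that cross-lingual recognition errors are substantially reduced.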


This thesis presents robust training techniques for hidden conditional random field (HCRF)-based acoustic modeling in Mandarin/English mixed-lingual speech recognition. Two issues are addressed: (1) the effect of noise on mixed-lingual speech recognition and (2) cross-lingual errors in mixed-lingual speech recognition. We address the first issue with the REST algorithm and reduce the errors in the second with a discriminative training algorithm combined with the REST algorithm (D-REST). The experimental results indicate that the HCRF-based framework achieves an average error rate reduction of 16.4% under the ROVER_2 noise environment compared with the traditional HMM approach. In addition, the HCRF-based framework significantly reduces cross-lingual errors in mixed-lingual speech recognition.


HMM, HCRF, Robust Training Algorithm


