Study of Associative Cepstral Statistics Normalization Techniques for Robust Speech Recognition in Additive Noise Environments

Feature statistics normalization techniques have been shown to be very successful in improving the noise robustness of a speech recognition system. In this paper, we propose an associative scheme that yields a more accurate estimate of the statistical information these techniques rely on. By properly integrating codebook and utterance knowledge, the resulting associative cepstral mean subtraction (A-CMS), associative cepstral mean and variance normalization (A-CMVN), and associative histogram equalization (A-HEQ) perform significantly better than their conventional utterance-based and codebook-based counterparts in additive noise environments. On the Aurora-2 clean-condition training task, the newly proposed A-HEQ achieves an average recognition accuracy of 90.69%, compared with 87.67% for utterance-based HEQ and 86.00% for codebook-based HEQ.
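To make the idea of combining the two sources of statistics concrete, the following is a minimal sketch, not the formulation used in the paper. It shows conventional utterance-based CMS and CMVN, plus a hypothetical `associative_cmvn` that linearly interpolates utterance statistics with codebook-derived statistics. The function names, the weight `alpha`, and the interpolation rule itself are illustrative assumptions; the abstract does not specify how the codebook and utterance knowledge are integrated.

```python
import numpy as np

def cms(feats):
    """Utterance-based CMS: subtract the per-dimension mean over all frames.

    feats has shape (num_frames, num_cepstral_coeffs).
    """
    return feats - feats.mean(axis=0)

def cmvn(feats, eps=1e-8):
    """Utterance-based CMVN: zero mean and unit variance per dimension."""
    mean = feats.mean(axis=0)
    var = feats.var(axis=0)
    return (feats - mean) / np.sqrt(var + eps)

def associative_cmvn(feats, codebook_mean, codebook_var, alpha=0.5, eps=1e-8):
    """Hypothetical associative variant (an assumption, not the paper's rule).

    Blends utterance statistics with codebook-derived statistics using a
    weight alpha in [0, 1]; alpha=1 reduces to utterance-based CMVN and
    alpha=0 uses the codebook statistics alone.
    """
    mean = alpha * feats.mean(axis=0) + (1.0 - alpha) * codebook_mean
    var = alpha * feats.var(axis=0) + (1.0 - alpha) * codebook_var
    return (feats - mean) / np.sqrt(var + eps)
```

In this sketch, a short utterance whose own statistics are unreliable could lean more heavily on the codebook by lowering `alpha`; how the weight should actually be chosen is outside what the abstract states.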

Keywords

Speech Recognition; Noise-Robust Feature; Codebook
