An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition

Abstract


This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimization (WEM) criterion, used to rescore N-best word strings, is appropriately modified for a Mandarin LVCSR system. Finally, a series of speech recognition experiments is conducted on the MATBN Mandarin Chinese broadcast news corpus. The experimental results demonstrate that the MPE training approach reduces the character error rate (CER) by 12% for a system initially trained with the maximum likelihood (ML) approach. Meanwhile, for unsupervised acoustic model adaptation, MPE-based linear regression (MPELR) adaptation outperforms conventional maximum likelihood linear regression (MLLR) in terms of CER reduction. When the WEM decoding approach is used for N-best rescoring, a slight performance gain over the conventional maximum a posteriori (MAP) decoding method is also observed.
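The difference between MAP decoding and WEM-style N-best rescoring can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy hypotheses, and the use of character-level edit distance as the error measure are all assumptions made for illustration. MAP picks the single most probable hypothesis, whereas the rescoring step picks the hypothesis with the lowest expected error against the whole N-best list.

```python
import math


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: character strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (r != h)))   # substitution or match
        prev = curr
    return prev[-1]


def posteriors(log_scores):
    """Normalize combined log scores over the N-best list into posteriors."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def map_decode(nbest):
    """Return the hypothesis with the highest score (hypothetical helper)."""
    return max(nbest, key=lambda item: item[1])[0]


def wem_decode(nbest):
    """Return the hypothesis with the lowest expected edit distance,
    treating every N-best entry in turn as the possible reference
    (hypothetical helper; the error measure is an assumption)."""
    hyps = [h for h, _ in nbest]
    post = posteriors([s for _, s in nbest])

    def expected_errors(cand):
        return sum(p * edit_distance(ref, cand) for ref, p in zip(hyps, post))

    return min(hyps, key=expected_errors)


# Toy N-best list of (hypothesis, log score); the values are invented.
nbest = [("abcd", math.log(0.40)),
         ("abxy", math.log(0.35)),
         ("abxz", math.log(0.25))]

print(map_decode(nbest))  # abcd
print(wem_decode(nbest))  # abxy
```

In this toy list the top-scoring hypothesis disagrees with the other two in its last two characters, while those two differ from each other in only one, so the expected-error criterion prefers one of them even though neither has the highest posterior.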

