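A Study on Building a Personalized Recognition Network with Mother-Tongue Assistance and Its Application to English Mispronunciation Detection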

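Computer Assisted Pronunciation Training (CAPT) systems combine audio processing and speech recognition technology to give English learners the opportunity to practice pronunciation in a virtual environment. Many finished systems are already available to the general public, but they still suffer from poor recognition performance and an inability to identify mispronunciations correctly. One cause is that a user's speech may carry a pronounced mother-tongue accent, which degrades recognition performance. The Pronunciation Confusion Network (PCN) used by typical systems helps recognize users' mispronunciations, but because a PCN is built from a fixed set of rules, it cannot adapt to the variety of ways users actually pronounce words and therefore fails to identify mispronunciations that fall outside those rules. In this thesis, we implement two methods to address these problems. First, we add information about the user's mother tongue to the recognition system so that it can detect whether parts of an utterance are pronounced in a mother-tongue manner, with the aim of improving the recognition rate for utterances mixed with a mother-tongue accent. Second, we let the system automatically detect each user's individual error patterns and add them dynamically to the error-rule definitions, adapting the PCN to the user and building a personalized PCN so that the system can catch the user's mispronunciations more effectively. Experiments show that adding mother-tongue information to the recognition system did not effectively improve recognition of accented speech. We conjecture that the mapping was determined too coarsely and that the acoustic models were not properly adapted, so unsuitable mapped models were added to the recognition network, which only increased the complexity of the PCN and lowered the recognition rate. The personalized PCN, by contrast, was shown experimentally to help the system identify mispronunciations that the fixed PCN cannot detect, and it raised recognition accuracy by 3.97%.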

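Keywords

computer assisted pronunciation training; pronunciation confusion network; speech recognition; speaker-dependent pronunciation error

Parallel Abstract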

Most current computer assisted pronunciation training (CAPT) systems use pronunciation confusion networks (PCNs) to detect erroneous pronunciations reliably. However, a PCN is usually constructed manually from subjective rules, without considering speaker-dependent pronunciation variations. This thesis proposes two approaches to alleviate this problem. First, using a mapping between phone segments of the second language (L2) and acoustic models of the mother tongue (L1), an L1-assisted bilingual PCN (BPCN) is incorporated into our CAPT system to detect errors in accented non-native speech. Second, speaker-dependent pronunciation errors are collected automatically by an iterative procedure to construct a personal PCN (PPCN). These two types of PCN take the L1 and the individual speaker into consideration, respectively, with the aim of constructing the overall recognition network for pronunciation training in a flexible and practical way.

Our experiments show that the PPCN yields the best improvement, whereas the performance of the BPCN degrades compared with the baseline PCN system. A possible reason is that the BPCN generates many more speaker-independent paths for each individual and, worse, results in higher perplexity. With the PPCN, the detection rate of erroneous phones in accented non-native speech increases from 77.6% to 81.57% compared with the manually constructed PCN.
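The idea of extending a fixed, rule-based confusion network with errors observed from one speaker can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the phone labels, the rule table `BASE_RULES`, the threshold `min_count`, and all function names are hypothetical, and the sketch shows only a single pass of what the abstract describes as an iterative procedure.

```python
from collections import defaultdict

# Hypothetical hand-written rules: canonical phone -> substitutions a
# fixed PCN would allow. The entries are illustrative assumptions only.
BASE_RULES = {"th": {"s"}, "v": {"w"}}


def build_pcn(canonical, rules):
    """Return one slot per canonical phone: the phone itself first,
    followed by its confusable variants in sorted order."""
    return [[p] + sorted(rules.get(p, set()) - {p}) for p in canonical]


def personalize(rules, observations, min_count=2):
    """Return a copy of `rules` extended with every (canonical, produced)
    substitution observed at least `min_count` times for one speaker."""
    counts = defaultdict(int)
    for canonical, produced in observations:
        if canonical != produced:
            counts[(canonical, produced)] += 1
    personal = {p: set(v) for p, v in rules.items()}
    for (canonical, produced), n in counts.items():
        if n >= min_count:
            personal.setdefault(canonical, set()).add(produced)
    return personal


# Assumed aligned (canonical, produced) phone pairs from one speaker.
observations = [("r", "l"), ("r", "l"), ("th", "s"), ("iy", "ih")]
canonical = ["th", "r", "iy"]  # rough phone sequence for "three"

personal_rules = personalize(BASE_RULES, observations)
fixed_pcn = build_pcn(canonical, BASE_RULES)
personal_pcn = build_pcn(canonical, personal_rules)
```

In this toy example the fixed network has no alternative path for `r`, so an `r`-to-`l` substitution could not be recognized as such, while the personalized network gains that path because the substitution was observed twice. The single `iy`-to-`ih` observation stays below the assumed threshold and is not added.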

Parallel Keywords

computer assisted pronunciation training; pronunciation confusion network; speech recognition; speaker-dependent pronunciation error

