
A Study on L1-assisted Personalized Recognition Networks for Pronunciation Error-Spotting in English Learning

Advisor: 張智星

Abstract


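Computer-assisted pronunciation training (CAPT) systems combine audio processing with speech recognition to let English learners practise pronunciation in a virtual environment. Many finished systems are already available to the general public, but they still recognize speech poorly and fail to identify mispronunciations correctly. One cause is that a user's speech may carry a strong native-language (L1) accent, which degrades recognition performance. Another is the pronunciation confusion network (PCN) that typical systems use: it helps identify a user's mispronunciations, but it is built from fixed rule information and cannot adapt to the varied ways individual users pronounce, so it cannot correctly identify arbitrary confused pronunciations.

In this thesis we implement two methods to address these problems. First, we add the user's native-language information to the recognition system so that it can detect whether part of an utterance is pronounced in the native-language manner, with the aim of improving the recognition rate for utterances mixed with an L1 accent. Second, we let the system automatically detect the user's individual pronunciation error patterns and dynamically add them to the error rule definitions. This adapts the PCN to the individual user and yields a personalized PCN, so that the system catches the user's mispronunciations more effectively.

The experiments show that adding native-language information to the recognition system did not effectively improve recognition of accented speech. We conjecture that the mapping was determined too coarsely and that the acoustic models were not properly adapted, so unsuitable mapped models were added to the recognition network; this only increased the complexity of the PCN and lowered the recognition rate. The personalized PCN, by contrast, was shown to help the system identify mispronunciations that the fixed PCN cannot detect, and it raised recognition accuracy by 3.97 percentage points.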

Parallel Abstract


Most current computer-assisted pronunciation training (CAPT) systems apply the concept of pronunciation confusion networks (PCN) to detect erroneous pronunciations reliably. However, the PCN is usually constructed manually from subjective rules, without considering speaker-dependent pronunciation variations. This thesis proposes two approaches to alleviate this problem. First, using a mapping between phone segments of the L2 and acoustic models of the mother tongue, an L1-assisted bilingual PCN (BPCN) is incorporated into our CAPT system to detect errors in accented non-native speech. Second, speaker-dependent pronunciation errors are collected automatically by an iterative procedure to construct a personal PCN (PPCN). The two types of PCN take the L1 and the individual speaker into consideration, respectively, in an attempt to construct the overall recognition network for pronunciation training in a flexible and practical way.

Our experiments show that the PPCN yields the best improvement, whereas the BPCN performs worse than the baseline PCN system. A possible reason is that the BPCN generates many more speaker-independent paths for each individual and, worse, increases perplexity. With the PPCN, the detection rate of erroneous phones in accented non-native speech increases from 77.6% to 81.57% compared with the manually constructed PCN.
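
The idea of a fixed PCN versus a personal PCN can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not specify the network format, the error-collection procedure, or any threshold. The function names `build_pcn` and `personalize`, the `min_count` threshold, the rule set, and the observed error pairs are all hypothetical, and the phone symbols simply follow the CMU Pronouncing Dictionary style.

```python
from collections import defaultdict


def build_pcn(canonical, rules):
    """Expand a canonical phone sequence into a simple confusion network.

    Each position lists the canonical phone first, followed by the
    substitutions that the rule set allows for it (sorted for stability).
    """
    return [[p] + sorted(rules.get(p, set()) - {p}) for p in canonical]


def personalize(rules, observed_errors, min_count=2):
    """Return a copy of the rules extended with a speaker's recurring errors.

    observed_errors holds (canonical_phone, spoken_phone) pairs; a pair is
    added only if it was seen at least min_count times (assumed threshold).
    """
    counts = defaultdict(int)
    for canonical, spoken in observed_errors:
        if canonical != spoken:
            counts[(canonical, spoken)] += 1
    personal = {p: set(s) for p, s in rules.items()}
    for (canonical, spoken), n in counts.items():
        if n >= min_count:
            personal.setdefault(canonical, set()).add(spoken)
    return personal


# Hypothetical fixed rules and one speaker's hypothetical detected errors.
fixed_rules = {"TH": {"S"}, "V": {"W"}}
canonical = ["TH", "IH", "NG", "K"]  # "think" in CMU-style phones
observed = [("TH", "F"), ("TH", "F"), ("IH", "IY"), ("NG", "N"), ("NG", "N")]

personal_rules = personalize(fixed_rules, observed)
fixed_pcn = build_pcn(canonical, fixed_rules)
personal_pcn = build_pcn(canonical, personal_rules)

print(fixed_pcn)
print(personal_pcn)
```

In this toy example the personal network gains the paths TH→F and NG→N, which the fixed rules do not cover, while the single IH→IY observation falls below the assumed threshold and is ignored. Repeating the collect-and-extend step over successive utterances would correspond loosely to the iterative procedure the abstract mentions.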

