Using Pitch Information to Improve Mandarin Chinese Pronunciation Assessment

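This thesis aims to address the insufficient accuracy of phone segmentation for Mandarin Chinese. A good acoustic model is the foundation of automatic speech assessment. Traditional computer-based speech assessment proceeds in two steps: the recorded utterance is first segmented with a trained acoustic model, typically by forced alignment, and the segmented utterance is then compared against the correct answer. However, conventional forced alignment often segments co-articulated sounds inaccurately. Here, co-articulation is defined as consecutive finals across a character boundary with no short pause between the characters, for example 蘇武 (ㄙㄨ ㄨˇ), 一意 (ㄧˊ ㄧˋ), and 無謂 (ㄨˊ ㄨㄟˋ). Such words reduce segmentation accuracy and in turn degrade the overall assessment. We therefore propose exploiting the tonal characteristics of Mandarin by adding pitch features to model training, with the expectation of improving segmentation accuracy on co-articulated sounds. The improved model is evaluated in three ways: sentence recognition rate, model ranking ratio, and segmentation accuracy, where segmentation is carried out with both a one-pass and a two-pass procedure. The first two evaluations measure model reliability, while segmentation accuracy is the focus of the thesis. The results show that, although adding pitch features slightly lowers the rank of co-articulated sounds in the model ranking ratio, it still improves segmentation accuracy.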
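As a rough illustration of what "adding pitch features to model training" can mean at the frame level, the sketch below appends a log-pitch value and its frame-to-frame difference to each existing feature frame. The function name `append_pitch`, the choice of log-pitch plus a simple delta, and the use of 0.0 for unvoiced frames are all assumptions made for illustration; the abstract does not specify which pitch features the thesis actually used.

```python
import math


def append_pitch(frames, f0_hz):
    """Append log-pitch and its first difference to each feature frame.

    frames: list of per-frame feature lists (for example MFCCs).
    f0_hz:  pitch estimate per frame in Hz; 0.0 marks an unvoiced frame.
    Both names and the feature layout are hypothetical.
    """
    if len(frames) != len(f0_hz):
        raise ValueError("frames and f0_hz must have the same length")

    # Log-compress voiced pitch; unvoiced frames get 0.0 (an assumption,
    # real systems often interpolate across unvoiced regions instead).
    log_f0 = [math.log(f) if f > 0 else 0.0 for f in f0_hz]

    augmented = []
    for t, frame in enumerate(frames):
        # The first frame has no predecessor, so its delta is 0.0.
        previous = log_f0[t - 1] if t > 0 else log_f0[t]
        delta = log_f0[t] - previous
        augmented.append(list(frame) + [log_f0[t], delta])
    return augmented


if __name__ == "__main__":
    toy_frames = [[1.0, 2.0], [3.0, 4.0]]
    toy_pitch = [100.0, 200.0]
    for row in append_pitch(toy_frames, toy_pitch):
        print(row)
```

The intuition is that two adjacent syllables with near-identical spectra, as in the co-articulated examples above, may still differ in tone, so a pitch dimension gives the aligner an extra cue at the boundary.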

Keywords

Mandarin speech recognition; phone segmentation; acoustic model; forced alignment; pitch information

Parallel abstract

This study aims to improve the accuracy of forced alignment for Mandarin Chinese. The performance of automatic speech assessment relies on the quality of the acoustic models. Traditional automatic speech assessment first performs model-based forced alignment on the input recording and then compares the aligned segments with the ground truth. However, forced alignment is not accurate enough for co-articulations. Here, we focus on co-articulations in which there is no short pause between two syllables, for example 蘇武 ("sū wǔ"), 一意 ("yí yì"), and 無謂 ("wú wèi"). The boundaries between two co-articulated syllables are often heavily misaligned, which degrades the quality of the assessment. We therefore propose a new approach that exploits the characteristics of tones in Mandarin Chinese: additional pitch features are used to improve the accuracy of forced alignment. Three metrics are evaluated: sentence recognition rate, model ranking ratio, and alignment accuracy under one-pass and two-pass alignment. The first two metrics focus on model reliability, and the third measures the accuracy of alignment. The results show that alignment accuracy improves while the model ranking ratio drops slightly.
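To make the notion of alignment accuracy concrete, here is a minimal sketch of one common way to score syllable boundaries: count a predicted boundary as correct when it falls within a fixed tolerance of the hand-labelled one. The function name `boundary_accuracy` and the 20 ms default tolerance are assumptions chosen for illustration; the abstract does not state how the thesis defines or thresholds this metric.

```python
def boundary_accuracy(predicted, reference, tolerance=0.02):
    """Fraction of predicted boundaries within `tolerance` seconds of the reference.

    predicted, reference: boundary times in seconds, paired by index.
    tolerance: allowed deviation in seconds (0.02 s is an assumed default).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0

    # A boundary is a hit when its absolute error does not exceed the tolerance.
    hits = sum(
        1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance
    )
    return hits / len(reference)


if __name__ == "__main__":
    forced_alignment = [0.50, 1.03, 1.50]  # toy boundaries from an aligner
    hand_labels = [0.51, 1.00, 1.49]       # toy ground-truth boundaries
    print(boundary_accuracy(forced_alignment, hand_labels))
```

In the toy data, the second boundary is off by 30 ms and is counted as a miss, so two of the three boundaries are scored as correct.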

Parallel keywords

Mandarin Chinese automatic speech assessment; forced alignment; co-articulation; acoustic model; phone segmentation
