Using Pitch Information to Improve Japanese Pronunciation Assessment

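This thesis aims to improve Japanese pronunciation assessment by adding pitch information, and it evaluates the improved system with assessment-oriented measures. We first build baseline acoustic models from Mel-frequency cepstral coefficients (MFCCs) and log energy features, trained on transcriptions that have been systematically adjusted to match actual pronunciation more closely. We then add pitch features on top of MFCCs and log energy to improve the baseline models. Pitch is extracted with two pitch-tracking methods, ACF (autocorrelation function) and UPDUDP (unbroken pitch determination using dynamic programming), which yield a broken (discontinuous) pitch contour and an unbroken (continuous) pitch contour, respectively. To test how the improved models perform in pronunciation assessment, we use two assessment-related tests: a ranking-based confidence measure and pronunciation error detection. In our experiments, the improved models outperform the baseline acoustic models in overall assessment performance, but adding pitch features does not suit every phoneme. We therefore also experiment with selectively loading either the models that include pitch features or the baseline models; the results show a further slight gain in assessment performance over non-selective loading.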

Keywords

pronunciation assessment; pitch information; computer-assisted pronunciation training; computer-assisted language learning

Parallel Abstract

The aim of this work is to improve Japanese pronunciation assessment by utilizing pitch information; the proposed method is evaluated against several performance measures. First, the baseline models are constructed using MFCCs (Mel-frequency cepstral coefficients) and log energy, with transcriptions adjusted systematically to account for the distinctive properties of Japanese pronunciation. We then train improved acoustic models, called pitch-added models, with MFCCs, log energy, and pitch. ACF (autocorrelation function) and UPDUDP (unbroken pitch determination using dynamic programming) are adopted as the pitch extraction methods, generating a broken pitch contour and an unbroken pitch contour, respectively. The performance of the proposed method is evaluated using a ranking-based confidence measure and pronunciation error detection. Experimental results show that the proposed method outperforms the baseline. However, unvoiced phonemes are considered to have no pitch values. We therefore load models selectively, choosing between the pitch-added models and the original ones, and the experimental results show a slight improvement of the selective approach over the non-selective approach.
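As a rough illustration of why an autocorrelation-based tracker produces a broken contour, the following minimal sketch estimates the pitch of a single frame and returns 0.0 when the frame looks unvoiced. It is not the thesis's implementation: the function name `acf_pitch`, the 80 to 500 Hz search range, and the 0.3 voicing threshold are assumptions chosen for the example, and UPDUDP is not shown.

```python
import math


def acf_pitch(frame, sample_rate, fmin=80.0, fmax=500.0, voicing_threshold=0.3):
    """Estimate the pitch (Hz) of one frame by autocorrelation.

    Returns 0.0 for frames judged unvoiced, which is what leaves gaps
    in the resulting contour. All default parameters are assumed values.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove the DC offset
    energy = sum(s * s for s in x)         # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0                         # silent frame: no pitch

    # Only search lags that correspond to the assumed pitch range.
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        acf = sum(x[i] * x[i + lag] for i in range(n - lag))
        normalized = acf / energy          # 1.0 would be a perfect repeat
        if normalized > best_value:
            best_lag, best_value = lag, normalized

    # A weak peak suggests the frame is not periodic.
    if best_lag == 0 or best_value < voicing_threshold:
        return 0.0
    return sample_rate / best_lag


# Usage: a 30 ms frame of a synthetic 200 Hz tone sampled at 16 kHz.
tone = [math.sin(2 * math.pi * 200 * i / 16000) for i in range(480)]
print(acf_pitch(tone, 16000))
```

Applying such a function frame by frame gives pitch values on voiced frames and zeros elsewhere. An unbroken method, by contrast, assigns a pitch value to every frame.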

Parallel Keywords

pronunciation assessment; pitch information; CAPT; CALL

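Cited By

曾泓熹 (2011). Improving Japanese acoustic models with sentence-final vowel models and nasalized voiced consonant (鼻濁音) pronunciation variation [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2011.00537

李宛穎 (2011). Using pitch information to improve Mandarin pronunciation assessment [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2011.00051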
