
使用音高資訊來改進日文發音評量

Improving Japanese Pronunciation Assessment by Utilizing Pitch Information

Advisor: Jyh-Shing Roger Jang (張智星)

Abstract


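The aim of this thesis is to improve Japanese pronunciation assessment by adding pitch information, and to test the resulting performance with assessment-related measures. We first use Mel-frequency cepstral coefficients (MFCCs) and log energy as features and, through a systematic procedure that adjusts the transcriptions so that they are closer to the actual pronunciation, train the baseline acoustic models. Next, in addition to MFCCs and log energy, we add pitch features to improve the baseline models. For pitch extraction we use two pitch-tracking methods, ACF (autocorrelation function) and UPDUDP (unbroken pitch determination using dynamic programming), which produce a broken (discontinuous) pitch contour and an unbroken (continuous) pitch contour, respectively. To test how the improved models perform in pronunciation assessment, we use two assessment-related tests: a ranking-based confidence measure and pronunciation error detection. In our experiments, the improved models outperform the baseline acoustic models in overall assessment performance, but not every phoneme benefits from the added pitch features. We therefore also experiment with selectively loading either the pitch-added model or the baseline model for each phoneme; the results show a further slight improvement in assessment performance over non-selective loading.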

Parallel Abstract


The aim of this work is to improve Japanese pronunciation assessment by utilizing pitch information; the proposed method is evaluated against several performance measures. First, the baseline models are constructed using MFCCs (Mel-frequency cepstral coefficients) and log energy. The transcriptions are adjusted systematically to account for properties specific to Japanese pronunciation. We then train the improved acoustic models, called pitch-added models, with MFCCs, log energy, and pitch. ACF (autocorrelation function) and UPDUDP (unbroken pitch determination using dynamic programming) are adopted as the pitch extraction methods, generating a broken pitch contour and an unbroken pitch contour, respectively. The performance of the proposed method is evaluated using a ranking-based confidence measure and pronunciation error detection. Experimental results show that the proposed method outperforms the baseline. However, unvoiced phonemes are considered to have no pitch values. We therefore try loading the models selectively, choosing between the pitch-added models and the original ones, and the experimental results show a slight improvement of the selective approach over the non-selective approach.
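As a rough illustration of how an autocorrelation-based tracker can yield a broken pitch contour, the sketch below estimates the pitch of a single frame and returns 0.0 when the frame looks unvoiced. It is not the implementation used in the thesis: the function name `acf_pitch`, the search range (`f_min`, `f_max`), and the `voicing_threshold` value are assumptions chosen for this example, and UPDUDP is not covered here.

```python
import math


def acf_pitch(frame, sample_rate, f_min=80.0, f_max=500.0, voicing_threshold=0.3):
    """Estimate the pitch (Hz) of one frame via autocorrelation.

    Returns 0.0 for frames judged unvoiced, which is what makes the
    resulting contour "broken". All default parameters are illustrative
    assumptions, not values taken from the thesis.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove the DC offset
    energy = sum(s * s for s in x)         # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0                         # silent frame: no pitch

    # Only search lags that correspond to plausible pitch values.
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), n - 1)

    best_lag, best_value = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag, normalized by the lag-0 value.
        value = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if value > best_value:
            best_lag, best_value = lag, value

    # A weak peak suggests an aperiodic (unvoiced) frame.
    if best_lag == 0 or best_value < voicing_threshold:
        return 0.0
    return sample_rate / best_lag


if __name__ == "__main__":
    sr = 16000
    # Synthetic 40 ms frame containing a 200 Hz sine wave.
    voiced = [math.sin(2 * math.pi * 200.0 * i / sr) for i in range(640)]
    print(acf_pitch(voiced, sr))
    print(acf_pitch([0.0] * 640, sr))
```

Applying such a function frame by frame gives a contour with zeros at unvoiced frames; an unbroken tracker would instead assign a pitch value to every frame.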

Keywords

pronunciation assessment; pitch information; CAPT; CALL

References


[5] JANG, J.S.R., CHEN, J.C., AND TSAI, T.L., "Automatic Pronunciation Assessment for Mandarin Chinese: Approach and System Overview," Computational Linguistics and Chinese Language Processing, 2007.
[6] JANG, J.S.R., SUN, C.T., AND MIZUTANI, E., "Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence," Prentice Hall PTR, Upper Saddle River, New Jersey, 1997.
[7] WITT, S.M., AND YOUNG, S.J., "Phone-level Pronunciation Scoring and Assessment for Interactive Language Learning," Speech Communication 30, 95-108, 2000.
[12] CUTLER, A., AND OTAKE, T., "Pitch Accent in Spoken-Word Recognition in Japanese," Acoustical Society of America, 1999.
[13] RABINER, L., "On the Use of Autocorrelation Analysis for Pitch Detection," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 25, No. 1, 24-33, 1977.

Cited By


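曾泓熹 (2011). Improving Japanese Acoustic Models with Sentence-Final Vowel Models and Bidakuon (Nasalized Voiced Consonant) Pronunciation Variation [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2011.00537
李宛穎 (2011). Improving Mandarin Pronunciation Assessment by Using Pitch Information [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2011.00051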
