Latent Prosody Model-Based Taiwan and Mainland Mandarin Accent Recognition

Advisor: Yuan-Fu Liao (廖元甫)

Abstract


This thesis investigates how prosodic information can be used to distinguish Taiwan-accented Mandarin from Mainland-accented Mandarin. We build a latent prosody model (LPM) from pitch (fundamental frequency), energy, and syllable duration, with speaker effects removed, and use its state-to-state probabilities to describe the prosodic variation that separates the two accents. Experiments use the Mandarin across Taiwan (MAT) corpus and the 500 People Telephone Read Speech Corpus (TRSC). Three baseline systems, a conventional phone recognizer followed by language models (PPR-LM), a universal phone recognizer followed by language models (UPR-LM), and a Gaussian mixture model on shifted delta cepstral features (SDC-GMM), give error rates between 23.79% and 29.11%. Fusing the three lowers the error rate to 20.68%, and adding the LPM-based system (LPM-LM) to that fusion lowers it further to 16.18%. The LPM therefore helps considerably in separating Taiwan and Mainland Mandarin accents.
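To put the reported gain in perspective, the short sketch below computes the absolute and relative error reduction obtained by adding the LPM-based system to the three-system fusion. Only the two error rates come from the abstract; the function and constant names are illustrative and do not appear in the thesis.

```python
def error_reduction(baseline, improved):
    """Return (absolute, relative) reduction in error rate, both in percent."""
    absolute = baseline - improved          # percentage points
    relative = 100.0 * absolute / baseline  # percent of the baseline error
    return absolute, relative


# Error rates (%) quoted in the abstract; the names are hypothetical.
FUSED_BASELINE = 20.68  # PPR-LM + UPR-LM + SDC-GMM
FUSED_WITH_LPM = 16.18  # the same fusion plus the LPM-based system

absolute, relative = error_reduction(FUSED_BASELINE, FUSED_WITH_LPM)
print(f"absolute: {absolute:.2f} points, relative: {relative:.1f}%")
```

Under these figures, the LPM-based system removes about 4.5 percentage points of error, or roughly a fifth of the errors left by the three-system fusion.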

