考慮語速影響之漢語韻律模型建立與語音合成之應用

A Modeling of Speaking Rate Influences on Mandarin Speech Prosody and its Application to TTS

Advisor: 王逸如 (Yih-Ru Wang)

Abstract


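This thesis proposes a new approach that accounts for the influence of speaking rate on prosodic variation in Mandarin and builds a speaking rate-dependent hierarchical prosodic model (SR-HPM) for Mandarin. The approach modifies the earlier unsupervised prosody labeling and modeling (PLM) method by treating speaking rate as a new continuous independent variable that affects both the prosodic-acoustic features and the prosodic model parameters. The SR-HPM is built from a parallel corpus recorded by one professional female announcer at four different speaking rates. Experimental results show that the effects of speaking rate on the model parameters agree with existing linguistic knowledge, confirming that the proposed approach can systematically quantify the influence of speaking rate on Mandarin prosody. Finally, the proposed prosodic model is applied to text-to-speech: we built a Mandarin text-to-speech system with controllable speaking rate. Subjective test results show that the proposed method clearly outperforms the conventional ML-based speaking rate control method at both fast and slow speaking rates.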

Keywords

Speaking rate; prosody model; speech synthesis

Parallel Abstract


In this thesis, a new approach to Mandarin speech prosody modeling that considers the effects of speaking rate is proposed. The approach modifies the previous prosody labeling and modeling (PLM) method: it takes speaking rate as a continuous independent variable and lets the prosodic-acoustic features and some parameters of the prosodic models depend on it in order to account for its influences. A speaking rate-dependent hierarchical prosodic model (SR-HPM) is then constructed from four speech corpora of a single female speaker recorded at four different speaking rates. An analysis of the effects of speaking rate on the model parameters showed that they agreed well with our prior knowledge. The proposed approach therefore provides a systematic and effective way to quantify the effects of speaking rate on Mandarin speech prosody. Finally, an application to prosody generation for Mandarin text-to-speech (TTS) is proposed. Using the well-trained SR-HPM, a speaking rate-controlled TTS system that can generate fluent speech for any given speaking rate is implemented. The subjective test results indicated that the proposed method was significantly better than the conventional ML-based method at fast and slow speaking rates.

