  • Thesis

根基於 HMM 之華語語音合成初步研究

An Initial Study on HMM-based TTS for Mandarin Chinese

Advisor: 張智星 (Jyh-Shing Roger Jang)

Abstract


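This study aims to improve a Mandarin Chinese speech synthesis system. Using an HMM-based speech synthesis system as the framework, we examine how different acoustic models ("initial and final labeling", "initial and tonal-final labeling", and "intra-syllable right-context-dependent labeling") and pitch tracking methods ("UPDUDP" and "RAPT") affect the synthesis results, with the goal of building a natural and fluent Mandarin speech synthesis system. We evaluate the naturalness of the synthesized speech with preference tests. Based on the results, we adopt "intra-syllable right-context-dependent labeling" as the system's acoustic model and "RAPT" as its pitch tracking method. The completed Mandarin speech synthesis system is demonstrated at http://mirlab.org/Demo/TTS/.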

Parallel Abstract


In this study, we aim to improve a Hidden Markov Model (HMM)-based text-to-speech system for Mandarin Chinese so that the synthesized speech is smoother and more fluent. We consider two factors: the design of the acoustic model and the pitch tracking algorithm used in training. We implement three acoustic models: “consonants and vowels”, “consonants and tonal vowels”, and “right-context-dependent phonemes of syllables”. For pitch tracking, we compare “RAPT” against “UPDUDP”. We evaluate the synthesized speech with preference tests. Based on the results, we choose “right-context-dependent phonemes of syllables” as the acoustic model and “RAPT” as the pitch tracking algorithm for our speech synthesis system. The implemented system is publicly available at http://mirlab.org/Demo/TTS/.
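
The three acoustic-model labeling schemes named in the abstract can be pictured with a small sketch. This is only one plausible reading, not the thesis's actual label format: the function names, the hand-split pinyin input, the use of tonal finals in the third scheme, and the HTK-style `a+b` notation for "a with right context b" are all assumptions made for illustration. "Initial" and "final" are used here for what the abstract calls consonants and vowels.

```python
# Illustrative only: hypothetical label formats, not the thesis's label files.

def initial_final(initial, final, tone):
    """Scheme 1 (assumed): toneless initial and final units."""
    return [unit for unit in (initial, final) if unit]


def initial_tonal_final(initial, final, tone):
    """Scheme 2 (assumed): the final carries the tone number."""
    return [unit for unit in (initial, f"{final}{tone}") if unit]


def right_context(initial, final, tone):
    """Scheme 3 (assumed reading): within a syllable, the initial is
    tagged with the tonal final that follows it, written initial+final."""
    tonal_final = f"{final}{tone}"
    if not initial:
        # Syllables without an initial have no unit to attach context to.
        return [tonal_final]
    return [f"{initial}+{tonal_final}", tonal_final]


if __name__ == "__main__":
    # "ni3 hao3", split by hand into (initial, final, tone) triples.
    syllables = [("n", "i", 3), ("h", "ao", 3)]
    for scheme in (initial_final, initial_tonal_final, right_context):
        labels = [unit for syl in syllables for unit in scheme(*syl)]
        print(scheme.__name__, labels)
```

Under these assumptions, each step from the first scheme to the third enlarges the set of distinct units, trading more training data per model for more specific models. That is the kind of trade-off a preference test between the schemes would probe.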


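Cited By


唐若華 (2010). A part-of-speech-based word segmentation method for improving a Mandarin speech synthesis system [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2010.00487
徐培霖 (2012). Improving speaker-adaptive speech synthesis with a feature replacement method [Master's thesis, National Tsing Hua University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0016-2002201315383235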