A Preliminary Study on HMM-based Mandarin Speech Synthesis

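This study aims to improve Mandarin speech synthesis. Using an HMM-based speech synthesis system as the framework, we examine how different acoustic models (“initial and final labeling”, “initial and tonal-final labeling”, and “within-syllable right-context-dependent labeling”) and pitch tracking methods (“UPDUDP” and “RAPT”) affect the synthesized speech, with the goal of building a natural and fluent Mandarin speech synthesis system. We evaluate the naturalness of the synthesized speech with preference tests. Based on the results, we adopt “within-syllable right-context-dependent labeling” as the system's acoustic model and “RAPT” as its pitch tracking method. The completed Mandarin speech synthesis system is demonstrated at http://mirlab.org/Demo/TTS/.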
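To make the difference between the three labeling schemes concrete, the following is a minimal sketch of how one Mandarin syllable might be turned into model labels under each scheme. The abstract does not give the actual label format, so the `Syllable` type, the function names, the tone digit appended to the final, and the HTK-style `a+b` notation for a right context are all assumptions made for illustration only.

```python
from typing import List, NamedTuple


class Syllable(NamedTuple):
    initial: str  # "" for a zero-initial syllable such as "an"
    final: str
    tone: int     # assumed encoding: 1-4, with 5 for the neutral tone


def initial_final(s: Syllable) -> List[str]:
    """Scheme 1 (assumed form): toneless initial and final."""
    return [unit for unit in (s.initial, s.final) if unit]


def initial_tonal_final(s: Syllable) -> List[str]:
    """Scheme 2 (assumed form): the final carries the tone."""
    tonal = f"{s.final}{s.tone}"
    return [s.initial, tonal] if s.initial else [tonal]


def right_context_in_syllable(s: Syllable) -> List[str]:
    """Scheme 3 (assumed form): the initial is also conditioned on the
    final that follows it inside the same syllable, written 'a+b'."""
    tonal = f"{s.final}{s.tone}"
    return [f"{s.initial}+{tonal}", tonal] if s.initial else [tonal]


hao3 = Syllable("h", "ao", 3)
print(initial_final(hao3))              # ['h', 'ao']
print(initial_tonal_final(hao3))        # ['h', 'ao3']
print(right_context_in_syllable(hao3))  # ['h+ao3', 'ao3']
```

Under these assumptions, each step from scheme 1 to scheme 3 increases the number of distinct models, which trades more training data per model for finer modeling of tone and coarticulation.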

Keywords

Hidden Markov Model; Speech Synthesis; Acoustic Model; Pitch Tracking

Parallel Abstract

This study aims to improve a Hidden Markov Model (HMM)-based text-to-speech system for Mandarin Chinese so that the synthesized speech is smoother and more fluent. We consider two factors in the training process: the design of the acoustic model and the pitch tracking algorithm. We implement three acoustic models: “consonants and vowels”, “consonants and tonal vowels”, and “right context dependent phonemes of syllables”. For pitch tracking, we compare “RAPT” against “UPDUDP”. We use preference tests to evaluate the synthesized speech. Based on the results, we choose “right context dependent phonemes of syllables” as the acoustic model and “RAPT” as the pitch tracking algorithm for our speech synthesis system. The implemented system is publicly available at http://mirlab.org/Demo/TTS/.
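As an illustration of how a preference test between two system variants could be summarized, here is a minimal sketch that reports the share of votes for one variant and a two-sided exact sign test against the "no preference" hypothesis. The abstract does not report vote counts or say whether any significance test was used, so the function name, the vote counts in the usage line, and the choice of a sign test are all assumptions.

```python
from math import comb
from typing import Tuple


def preference_summary(votes_a: int, votes_b: int) -> Tuple[float, float]:
    """Return (share of votes for A, two-sided exact sign-test p-value).

    Ties or "no preference" answers are assumed to have been dropped
    before counting.
    """
    n = votes_a + votes_b
    if n == 0:
        raise ValueError("at least one vote is required")
    share_a = votes_a / n
    # Probability of a split at least this lopsided if listeners chose
    # between A and B by a fair coin flip.
    k = max(votes_a, votes_b)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return share_a, p_value


# Hypothetical counts: 3 listeners prefer variant A, 1 prefers variant B.
print(preference_summary(3, 1))  # (0.75, 0.625)
```

With so few hypothetical votes the split is far from significant, which is why preference tests are usually run over many utterances and listeners.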

Parallel Keywords

Hidden Markov Model; Speech Synthesis; Acoustic Model; Pitch Tracking

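Cited By

唐若華 (2010). A Part-of-Speech-Based Word Segmentation Method for Improving Mandarin Speech Synthesis Systems [Master's thesis, National Tsing Hua University]. Airiti Library. https://doi.org/10.6843/NTHU.2010.00487

徐培霖 (2012). Improving Speaker-Adaptive Speech Synthesis with a Feature Replacement Method [Master's thesis, National Tsing Hua University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0016-2002201315383235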
