Improving Speaker-Adaptive Speech Synthesis with a Feature Substitution Method

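This thesis implements an online system for speaker adaptation and Mandarin speech synthesis, and proposes a feature substitution method to improve the synthesized speech. The user enters the text to be synthesized; the system segments the text into words, labels the tones, and synthesizes speech with the acoustic model the user has selected. The system also provides speaker adaptation. The user records speech online, and the system scores each recording against its script to decide whether to accept it as adaptation data. Once recording is finished, a back-end process automatically runs speaker adaptation and trains an acoustic model for that user. In addition, this thesis proposes a feature substitution method to improve speech synthesized from speaker-adapted models. The method replaces the spectral features estimated by the acoustic model with spectral features taken from real speech segments, which raises the similarity between the synthesized audio and the target speaker's voice. In a mean opinion score (MOS) evaluation, this method scored 0.4 points higher than the original speaker-adapted synthesis.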

Keywords

speech synthesis; speaker adaptation; text-to-speech

Parallel Abstract

This study implements an online Mandarin speech synthesis system with speaker adaptation and proposes a speech feature substitution approach to improve the quality of the synthesized speech. The system takes user-provided text as input and performs POS and tone tagging. Synthesis is then carried out with the acoustic model chosen by the user. The system also provides a speaker adaptation function. First, the user is asked to record a few sentences through a web interface. A speech scoring technique is used to validate the quality of the recorded utterances. The system then uses the accepted utterances to perform speaker adaptation, adjusting the acoustic models used for speech synthesis. Moreover, this study proposes a speech feature substitution method to improve the quality of speaker-adapted synthesis. This method uses spectral features extracted from real speech utterances in place of the spectral features estimated by the acoustic models, which increases the similarity between the synthesized speech and the target speaker's speech. The experimental results show that the proposed method improves on the original speaker-adapted synthesis by 0.4 points in mean opinion score (MOS).

Parallel Keywords

speech synthesis; speaker adaptation; text-to-speech
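
The feature substitution step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the unit labels, the frame-span format, the function names, and the nearest-frame resampling used to fit a real segment to the predicted duration are all assumptions made for illustration. Only the spectral stream is replaced here; other parameters such as pitch and duration are assumed to still come from the adapted acoustic model.

```python
import numpy as np


def resample_frames(frames: np.ndarray, target_len: int) -> np.ndarray:
    """Pick evenly spaced frames so the segment has exactly target_len frames.

    Nearest-frame resampling is an assumption of this sketch, not a detail
    taken from the thesis.
    """
    idx = np.round(np.linspace(0, len(frames) - 1, target_len)).astype(int)
    return frames[idx]


def substitute_features(predicted, boundaries, real_units):
    """Replace model-estimated spectral frames with frames from real speech.

    predicted  : (T, D) array of spectral features from the adapted model.
    boundaries : list of (label, start, end) frame spans inside `predicted`
                 (hypothetical format).
    real_units : dict mapping a label to an (N, D) array of spectral features
                 extracted from the target speaker's recordings (hypothetical).

    Returns the modified feature array and the labels that were replaced.
    """
    out = predicted.copy()
    replaced = []
    for label, start, end in boundaries:
        real = real_units.get(label)
        # Keep the model's estimate when no usable real segment exists.
        if real is None or len(real) == 0 or end <= start:
            continue
        out[start:end] = resample_frames(real, end - start)
        replaced.append(label)
    return out, replaced


# Toy example: two units, only the first has a real recording available.
predicted = np.zeros((6, 2))
boundaries = [("ni3", 0, 3), ("hao3", 3, 6)]
real_units = {"ni3": np.ones((5, 2))}
mixed, replaced = substitute_features(predicted, boundaries, real_units)
```

In this toy run, the frames of the first unit are overwritten with real-speech features and the second unit keeps the model's estimate, since no recording of it is available.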

