  • Thesis

Implementation of Mandarin Speech Conversion on a Mixed Excitation Linear Prediction (MELP) Codec (中文語音轉換在混合激發線性預測語音編碼器上之實現)

Advisor: 賴飛羆

Abstract


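This thesis builds a speech conversion method on top of the low-bit-rate 2.4 kbps Mixed Excitation Linear Prediction (MELP) speech coder, so that it can be used in real-time communication to add entertainment value or even a privacy function. Statistics gathered from a large speech corpus show that, when the same speaker utters the same syllable, the first- and second-stage parameters of the four-stage line spectral frequency (LSF) parameters produced by MELP analysis tend to cluster around a small number of vector indices. The thesis proposes a syllable-based mapping approach that builds a lookup table of vocal-tract spectral features between the source speaker and the target speaker, which reduces the discontinuous speech caused by selecting the wrong syllable. In addition, the pitch period is adjusted linearly between the two speakers to modify the original excitation (residual) signal of the speech. Simulation results confirm that the source speaker's voice can be converted to sound like the target speaker, and that the quality of the synthesized speech is satisfactory.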
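The syllable-based lookup table described above can be illustrated with a minimal Python sketch. The abstract does not say how the table is actually constructed, so everything here is an assumption: the frame layout `(syllable, stage1_index, stage2_index)`, the function names `build_mapping_table` and `convert_frame`, and the choice of the most frequent index pair as each syllable's representative.

```python
from collections import Counter, defaultdict


def build_mapping_table(source_frames, target_frames):
    """Pair each syllable's dominant source LSF index pair with the target's.

    Each input is an iterable of (syllable, stage1_index, stage2_index)
    tuples, one per analysed frame. This layout is hypothetical.
    """
    def dominant_pairs(frames):
        counts = defaultdict(Counter)
        for syllable, i1, i2 in frames:
            counts[syllable][(i1, i2)] += 1
        # Keep the most frequent (stage 1, stage 2) pair for each syllable.
        return {s: c.most_common(1)[0][0] for s, c in counts.items()}

    src = dominant_pairs(source_frames)
    tgt = dominant_pairs(target_frames)
    # Only syllables observed for both speakers can be mapped.
    return {s: (src[s], tgt[s]) for s in src if s in tgt}


def convert_frame(table, syllable, i1, i2):
    """Return the target's index pair, or the input pair if the syllable is unknown."""
    if syllable in table:
        return table[syllable][1]
    return (i1, i2)
```

Keying the table on the syllable rather than on individual frames means every frame inside a syllable is replaced consistently, which is one plausible way to limit frame-to-frame discontinuities.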

Parallel Abstract


This work reuses the parameters of the 2.4 kbps Mixed Excitation Linear Prediction (MELP) voice coder to implement speech conversion from a source speaker to a specified target speaker. Analyzing speech with the MELP algorithm, we found statistically that, for the same phoneme spoken by the same speaker, the first- and second-stage indices of the MELP four-stage vector-quantized line spectral frequencies (LSFs) tend to cluster around certain index values. We propose a method based on Mandarin syllables that builds a mapping table of these indices between the spectral features of the source and target speakers. To avoid the discontinuous speech caused by syllable mismatches, we propose a new segmentation technique based on feature-vector frames. The pitch periods of the residual signal are also modified using a linear relationship. Simulation results show that the source speaker's voice can be converted to that of the target speaker, and that the quality of the synthesized speech is good.
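The linear pitch-period modification can be sketched as follows. The abstract only states that a linear relationship is used, so the particular form here (matching the mean and standard deviation of the two speakers' pitch periods) is an assumption, as are the function names and the clamping bounds `lo` and `hi`, which are placeholder values in samples rather than figures taken from the thesis.

```python
from statistics import mean, pstdev


def fit_linear_pitch_map(source_periods, target_periods):
    """Fit period_target = slope * period_source + intercept.

    Assumed form: match the mean and standard deviation of the
    two speakers' pitch periods (measured in samples).
    """
    mu_s, mu_t = mean(source_periods), mean(target_periods)
    sd_s, sd_t = pstdev(source_periods), pstdev(target_periods)
    # Fall back to a pure shift if the source pitch never varies.
    slope = sd_t / sd_s if sd_s > 0 else 1.0
    intercept = mu_t - slope * mu_s
    return slope, intercept


def convert_pitch_period(period, slope, intercept, lo=20.0, hi=160.0):
    """Apply the linear map and clamp to an assumed valid range."""
    return min(hi, max(lo, slope * period + intercept))
```

For example, fitting on voiced frames from both speakers and then calling `convert_pitch_period` on each source frame's pitch period yields the period passed to the synthesizer in place of the original.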

Keywords

MELP; Speech Conversion; Mandarin syllable

