A Study on Developing a Mandarin Speech Synthesis System with an Embedded Digital Signal Processor

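This thesis uses the Texas Instruments TMS320C6711 DSK embedded digital signal processing system as its development platform to study Mandarin speech synthesis. The aim is to overcome the limits of the hardware resources and realize a real-time Mandarin speech synthesis system that has low computational cost, low memory requirements, and good synthesized speech quality, and that can run on low-cost embedded devices. The experiments use the sub-syllable as the basic synthesis unit and, following the rules of oral articulation, divide the finals (韻母) into five classes to reduce the size of the synthesis-unit database. At a sampling rate of 8 kHz with 8 bits per sample, the synthesis-unit database occupies only 74 KB of memory. The synthesis stage uses Time Domain Pitch Synchronous Overlap and Add (TD-PSOLA) together with a pitch scaling function to adjust the pitch and duration of the speech, and uses waveform interpolation for spectral smoothing. Experiments show that a speech synthesis system built on sub-syllable units achieves good speech quality. To verify the effectiveness and feasibility of the results through actual measurement, the complete synthesis technique was embedded in the TMS320C6711 DSK, forming a practical digital signal processing system that delivers easy-to-use, accurate, and fast Mandarin speech synthesis.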
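To put the 74 KB database figure in perspective, the short calculation below converts it into seconds of stored audio. It is only a back-of-the-envelope sketch: it assumes the database holds uncompressed PCM and that "K" means 1024 bytes, neither of which the abstract states, and the function names are illustrative.

```python
# Assumptions (not stated in the abstract): the synthesis-unit database
# stores uncompressed PCM samples, and "74 K Bytes" means 74 * 1024 bytes.
SAMPLE_RATE_HZ = 8000      # sampling rate given in the abstract
BITS_PER_SAMPLE = 8        # sample width given in the abstract
DATABASE_BYTES = 74 * 1024


def bytes_per_second(sample_rate_hz, bits_per_sample):
    """Storage consumed by one second of mono PCM audio."""
    return sample_rate_hz * bits_per_sample // 8


def seconds_of_audio(total_bytes, sample_rate_hz, bits_per_sample):
    """How many seconds of mono PCM audio fit into total_bytes."""
    return total_bytes / bytes_per_second(sample_rate_hz, bits_per_sample)


if __name__ == "__main__":
    rate = bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
    total = seconds_of_audio(DATABASE_BYTES, SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
    print(f"{rate} bytes per second, {total:.3f} seconds of audio in the database")
```

Under these assumptions the whole unit inventory amounts to roughly 9.5 seconds of audio, which illustrates why grouping the finals into five classes matters for a memory-constrained target.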

Keywords

Text-to-speech; waveform interpolation; pitch synchronous overlap and add; digital signal processing; embedded system; speech synthesis

Parallel Abstract

This research uses the embedded digital signal processing system TMS320C6711 DSK to develop a Mandarin speech synthesizer. During development, a number of measures were taken to overcome the limitations of the hardware resources. The result is a real-time Mandarin speech synthesizer with low computational cost, low memory requirements, and good performance. In the experiments, the sub-syllable is treated as the basic synthesis unit. Based on an analysis of oral motion, we classify the vowels into five categories to reduce the amount of data in the database. The database requires only 74 KB of memory at a sampling rate of 8 kHz with 8 bits per sample. During the synthesis phase, TD-PSOLA (Time Domain Pitch Synchronous Overlap and Add) is adopted to adjust the pitch and duration of the speech, and waveform interpolation is used to smooth the spectrum. The experimental results verify that the identification rate of the synthesized speech is very high. To obtain accurate Mandarin speech synthesis results conveniently and quickly, we built a practical digital signal processing system embedded in the TMS320C6711 DSK board. The experimental results show that the system is effective and practicable.
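As a rough illustration of how TD-PSOLA changes pitch and duration, the sketch below overlap-adds Hann-windowed, two-period segments taken at pitch marks and re-spaces them on the output time axis. It is a minimal textbook-style sketch, not the thesis implementation: the function `td_psola`, its parameters `pitch_scale` and `time_scale`, and the assumption that pitch marks are already available are all hypothetical, and the constant factors stand in for the pitch scaling function mentioned in the abstract.

```python
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def td_psola(signal, marks, pitch_scale=1.0, time_scale=1.0):
    """Simplified TD-PSOLA (illustrative helper, not the thesis code).

    signal      -- list of float samples
    marks       -- strictly increasing sample indices of the pitch marks
    pitch_scale -- above 1 packs the segments closer together (higher pitch)
    time_scale  -- above 1 stretches the output (longer duration)
    """
    if len(marks) < 2 or any(b <= a for a, b in zip(marks, marks[1:])):
        raise ValueError("marks must be at least two strictly increasing indices")
    if pitch_scale <= 0 or time_scale <= 0:
        raise ValueError("scale factors must be positive")

    out_len = int(round(len(signal) * time_scale))
    out = [0.0] * out_len

    # Local pitch period at each analysis mark (the first mark reuses the next gap).
    periods = [marks[1] - marks[0]]
    periods += [marks[i] - marks[i - 1] for i in range(1, len(marks))]

    t_syn = marks[0] * time_scale      # position of the current synthesis mark
    t_end = marks[-1] * time_scale
    while t_syn <= t_end:
        # Map the synthesis mark back to analysis time and pick the nearest mark.
        t_ana = t_syn / time_scale
        k = min(range(len(marks)), key=lambda i: abs(marks[i] - t_ana))
        p = periods[k]
        window = hann(2 * p + 1)       # two pitch periods, centred on the mark
        centre = int(round(t_syn))
        for j in range(-p, p + 1):
            src = marks[k] + j
            dst = centre + j
            if 0 <= src < len(signal) and 0 <= dst < out_len:
                out[dst] += signal[src] * window[j + p]
        # The next synthesis mark is one (scaled) pitch period later.
        t_syn += p / pitch_scale
    return out


if __name__ == "__main__":
    period = 40                         # 200 Hz tone at an 8 kHz sampling rate
    tone = [math.sin(2.0 * math.pi * n / period) for n in range(400)]
    pitch_marks = list(range(40, 361, period))
    raised = td_psola(tone, pitch_marks, pitch_scale=1.5)
    print(len(raised))
```

With both factors set to 1 and evenly spaced marks, the overlapping Hann windows sum to one between the first and last marks, so the input is reproduced there. For other factors this sketch does not renormalize the amplitude, which a practical synthesizer would need to handle.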

Parallel Keywords

Speech Synthesis; Embedded System; TTS; Digital Signal Processing; Waveform Interpolation; PSOLA

Cited By

王宗彥 (2007). A fingerprint recognition system implemented with a digital signal processor [Master's thesis, Tamkang University]. Airiti Library. https://doi.org/10.6846/TKU.2007.00871
