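This thesis uses the Texas Instruments TMS320C6711 DSK, an embedded digital signal processing system, as the development platform for studying Mandarin speech synthesis. The goal is to overcome the limits of the hardware resources and realize a real-time Mandarin speech synthesis system that has low computational cost, low memory requirements, and good synthesized speech quality, and that can be implemented on low-cost embedded devices. The experiments use the sub-syllable as the basic synthesis unit and, following the rules of oral articulation, divide the finals into five classes to reduce the size of the synthesis-unit database. At a sampling rate of 8 kHz with 8 bits per sample, the synthesis-unit database occupies only 74 KB of memory. The synthesis stage uses Time-Domain Pitch-Synchronous Overlap-and-Add (TD-PSOLA) together with a pitch scaling function to adjust the pitch and duration of the speech, and uses waveform interpolation for spectral smoothing. The experiments show that a speech synthesis system built on sub-syllable units achieves good speech quality. To verify the effectiveness and feasibility of the results by actual measurement, the entire synthesis technique is embedded in the TMS320C6711 DSK, forming a practical digital signal processing system that is simple to operate and produces correct Mandarin speech synthesis quickly.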
This research uses the TMS320C6711 DSK embedded digital signal processing system to develop a Mandarin speech synthesizer. Several measures were taken during development to overcome the limited hardware resources, resulting in a real-time Mandarin speech synthesizer with low computational cost, low memory requirements, and good performance. In the experiments, the sub-syllable is used as the basic synthesis unit. Based on an analysis of oral motion, the vowels are classified into five categories to reduce the size of the database. At a sampling rate of 8 kHz with 8 bits per sample, the database requires only 74 KB of memory. During the synthesis phase, TD-PSOLA (Time-Domain Pitch-Synchronous Overlap-and-Add) is used to adjust the pitch and duration of the speech, and waveform interpolation is used to smooth the spectrum. The experimental results verify that the identification rate of the synthesized speech is very high. To obtain accurate Mandarin speech synthesis conveniently and quickly, we built a practical digital signal processing system embedded in the TMS320C6711 DSK board. The experimental results show that the system is effective and practical.
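The TD-PSOLA step named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `hann` and `td_psola`, the parameters `pitch_scale` and `time_scale`, and the normalization by accumulated window weight are all assumptions made for illustration. The pitch marks are assumed to be given, and the pitch scaling function and waveform-interpolation smoothing described above are left out. Pure Python is used for readability, whereas code for the DSK would more likely be written in C.

```python
import math


def hann(n):
    """Symmetric Hann window of length n (n >= 3); both endpoints are zero."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def td_psola(signal, marks, pitch_scale=1.0, time_scale=1.0):
    """Illustrative TD-PSOLA resynthesis (hypothetical helper, not from the thesis).

    signal      : list of float samples
    marks       : strictly increasing sample indices of the pitch marks (assumed given)
    pitch_scale : > 1 places grains closer together, raising the pitch
    time_scale  : > 1 lengthens the output
    """
    if len(marks) < 3:
        raise ValueError("need at least three pitch marks")
    if pitch_scale <= 0 or time_scale <= 0:
        raise ValueError("scale factors must be positive")

    n_out = int(round(len(signal) * time_scale))
    out = [0.0] * n_out    # overlap-added grains
    wsum = [0.0] * n_out   # accumulated window weight, used for normalization
    interior = range(1, len(marks) - 1)

    t_syn = marks[0] * time_scale          # position of the next synthesis mark
    while t_syn < n_out:
        t_ana = t_syn / time_scale         # matching time on the analysis axis
        # Pick the interior analysis mark closest to that time.
        i = min(interior, key=lambda k: abs(marks[k] - t_ana))
        period = (marks[i + 1] - marks[i - 1]) // 2   # local pitch period
        if period < 1:
            raise ValueError("pitch marks are too close together")
        win = hann(2 * period + 1)         # grain spans two pitch periods
        centre = int(round(t_syn))
        for j in range(-period, period + 1):
            src, dst = marks[i] + j, centre + j
            if 0 <= src < len(signal) and 0 <= dst < n_out:
                w = win[j + period]
                out[dst] += w * signal[src]
                wsum[dst] += w
        t_syn += period / pitch_scale      # closer spacing gives a higher pitch

    # Divide by the accumulated weight so grain overlap does not change amplitude.
    return [o / w if w > 1e-6 else 0.0 for o, w in zip(out, wsum)]
```

Calling `td_psola(samples, marks, pitch_scale=1.2, time_scale=0.9)` would raise the pitch by about 20% and shorten the unit by about 10%, where `samples` and `marks` are placeholders for one stored synthesis unit and its pitch marks. The two factors are independent because the grain spacing sets the pitch while the mapping from synthesis time back to analysis time sets the duration.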