  • Thesis

國語語音屬性偵測器之製作及其應用

Implementation and applications of Mandarin Pronunciation manner detector

Advisor: 王逸如

Abstract


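Speech recognizers built on a speech attribute detection framework have regained researchers' attention over the past decade, and many related studies are under way. However, speech attribute detectors are generally trained in a supervised manner, and when the training corpus lacks correctly labeled answers, the resulting detectors usually perform poorly. This thesis uses the phone labels from an HMM recognizer together with the output of an English speech attribute detector to obtain automatic phone boundaries for a Mandarin corpus, and uses them to build a reliable set of Mandarin speech attribute detectors. The thesis also applies the resulting Mandarin speech attribute boundary detectors to a spontaneous speech corpus. Spontaneous speech contains many instances of phone omission and assimilation, which cause HMM speech recognizers to perform poorly. Here, the trained Mandarin speech attribute detectors are used to observe phone omission in words that are common in spoken language.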

Parallel Abstract


Speech recognition based on pronunciation detectors has received considerable attention over the past decade, and much related research has been conducted. However, a well-performing detector generally requires supervised training, and detection performance is usually poor when the training corpus lacks correct segmentations. In this thesis, we use the segmentations produced by an HMM recognizer, together with the output of an English pronunciation manner detector trained on the TIMIT corpus, to build a reliable Mandarin pronunciation manner detector. We also apply this Mandarin pronunciation manner detector to spontaneous speech. Spontaneous speech exhibits many linguistic phenomena, such as phone reduction and assimilation, that make it difficult for an HMM recognizer to achieve reliable results. We therefore use the trained manner detector to observe common phone-reduction phenomena in spontaneous speech.

Parallel Keywords

manner detector; boundary refinement

