
Token Passing Model Applied to Continuous Mandarin Keyword Spotting System (標記傳遞模式應用於中文連續語音關鍵詞辨認系統)

Advisor: 杜筑奎

Abstract


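Operating a computer by voice is an important application of speech signal processing. This study uses the Token Passing model to build a keyword-spotting speech recognition system, developed with Microsoft Visual C++ 6.0 on the Windows 2000 operating system. The thesis uses Mel-Frequency Cepstral Coefficients (MFCC) as feature parameters and Continuous Hidden Markov Models (CHMM) as acoustic models. Right-context-dependent sub-syllabic models are combined to form models of the 415 Mandarin monosyllables; the inventory comprises 113 right-context-dependent initials and 39 finals, with each initial modeled by 3 states and each final by 4 states. A silence model and a short-pause model are also built, each with 4 states. The Token Passing model is used to construct a filler-model scheme for spotting keywords, and Utterance Verification is used to check whether a detected keyword is correct. Finally, a course-selection system with a speech-input interface is designed: a hierarchical structure reduces the number of keywords, a text-to-speech agent guides the user, and confirmation and rejection utterances are used to raise the keyword detection rate.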
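The model inventory described in the abstract implies a fixed number of HMM states. The short Python sketch below only works through that arithmetic. It assumes the stated state counts refer to emitting states and that a syllable model is one initial followed by one final; the dictionary layout and function names are illustrative and do not come from the thesis.

```python
# Unit inventory as stated in the abstract.
# Assumption: the state counts are emitting states per model.
INVENTORY = {
    "initial": {"units": 113, "states": 3},      # right-context-dependent initials
    "final": {"units": 39, "states": 4},         # context-independent finals
    "silence": {"units": 1, "states": 4},
    "short_pause": {"units": 1, "states": 4},
}


def total_states(inventory):
    """Total number of states across all acoustic models."""
    return sum(entry["units"] * entry["states"] for entry in inventory.values())


def syllable_states(inventory):
    """States in one syllable model built as an initial followed by a final."""
    return inventory["initial"]["states"] + inventory["final"]["states"]


if __name__ == "__main__":
    print(total_states(INVENTORY))     # 113*3 + 39*4 + 4 + 4
    print(syllable_states(INVENTORY))  # 3 + 4
```

Under these assumptions the 154 sub-syllabic and pause models hold 503 states in total, and each of the 415 syllables that has both an initial and a final is covered by a 7-state concatenation instead of a separately trained whole-syllable model.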

Parallel Abstract


Using speech as input to operate a computer is an important application of speech signal processing research. The system described in this study is a keyword spotting system implemented with the Token Passing Model, developed in Microsoft Visual C++ 6.0 on the Windows 2000 operating system. The system uses Mel-Frequency Cepstral Coefficients (MFCC) as feature parameters and Continuous Hidden Markov Models (CHMM) as acoustic models. The 415 Mandarin syllables are decomposed into sub-syllabic units: 113 right-context-dependent INITIALs and 39 context-independent FINALs. Each INITIAL is represented by a 3-state model and each FINAL by a 4-state model. In addition, a silence model and a short-pause model are built, each with 4 states. The system then uses Token Passing to build a filler-model keyword spotting system, and uses Utterance Verification to decide whether a detected keyword is correct. Finally, the research develops a speech-input interface system that uses a hierarchical architecture to reduce the number of keywords and a Text-to-Speech agent to guide users. The system also uses confirmation and rejection sentences to improve the keyword detection rate.
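The following Python sketch illustrates the general idea of token passing over a keyword-plus-filler loop. It is a toy under stated assumptions, not the thesis implementation: the `emission` function is a hypothetical stand-in for CHMM log-likelihoods computed from MFCC frames, frames and state labels are single characters, every model is left-to-right with self-loops, and the scores, `word_penalty`, and the `<filler>` name are invented for illustration.

```python
from dataclasses import dataclass

NEG_INF = float("-inf")


@dataclass(frozen=True)
class Token:
    score: float = NEG_INF   # accumulated log score of the best path so far
    history: tuple = ()      # (word, end_frame) records left at word exits


def emission(label, frame):
    """Toy log-likelihood; a real system would score a frame with a CHMM state."""
    if label == "*":         # filler state accepts any frame at a flat cost
        return -1.0
    return -0.1 if label == frame else -5.0


def token_passing(frames, models, word_penalty=-0.5):
    """models maps a word to its list of state labels (left-to-right HMM)."""
    tokens = {(w, i): Token() for w, states in models.items()
              for i in range(len(states))}
    entry = Token(0.0, ())   # token waiting at the entry node of the word loop
    for t, frame in enumerate(frames):
        new_tokens = {}
        for (w, i) in tokens:
            # Candidates: self-loop, advance from the previous state,
            # or (for the first state) enter the model from the loop entry.
            candidates = [tokens[(w, i)]]
            if i > 0:
                candidates.append(tokens[(w, i - 1)])
            else:
                candidates.append(Token(entry.score + word_penalty, entry.history))
            best = max(candidates, key=lambda tok: tok.score)
            new_tokens[(w, i)] = Token(best.score + emission(models[w][i], frame),
                                       best.history)
        tokens = new_tokens
        # Tokens in final states may exit: record the word, return to the entry.
        exits = [Token(tok.score, tok.history + ((w, t),))
                 for (w, i), tok in tokens.items() if i == len(models[w]) - 1]
        entry = max(exits, key=lambda tok: tok.score)
    return entry


def spot_keywords(frames, models, filler="<filler>"):
    """Return the non-filler words on the best path, in order."""
    best = token_passing(frames, models)
    return [word for word, _ in best.history if word != filler]
```

With hypothetical models `{"ab": ["a", "b"], "cd": ["c", "d"], "<filler>": ["*"]}` and the frame sequence `list("xxaabbxcdd")`, the filler model absorbs the `x` frames and the best token's history contains the two keywords. A verification stage, such as the Utterance Verification step named in the abstract, would then decide whether to accept each spotted keyword.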


Cited By


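林曉銘 (2004). 以嵌入式數位信號處理器發展中文語音合成系統之研究 [A Study on Developing a Mandarin Speech Synthesis System with an Embedded Digital Signal Processor] (Master's thesis, Chung Yuan Christian University). Airiti Library. https://doi.org/10.6840/cycu200400174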
