  • Thesis

運用語音辨識技術在英漢電子字典之設計

The design for electronic dictionary using speech recognition technology

Advisor: 郭崇仁

Abstract


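This thesis uses hidden Markov models (HMMs) and the Viterbi algorithm as the main tools for speech recognition. The speech signal first goes through a series of preprocessing steps: silence removal, framing, pre-emphasis, and windowing. Feature parameters are then extracted from the signal, namely cepstral coefficients and differential (delta) cepstral coefficients. Once the features have been extracted, the HMMs and the Viterbi algorithm are used to train the speech models and to recognize the test utterances. For silence removal we use time-domain end-point detection, which decides whether a segment is silence from its signal energy and its zero-crossing rate. We also developed a double-threshold end-point detection method to reduce the end-point detection errors that arise with noisy speech. In addition, we used Visual Basic 6.0 to implement a working speech recognition system with three parts: model training, corpus recognition, and analysis and simulation. The system also includes a spoken English-Chinese dictionary function that offers voice input: the letters of the English word to be looked up are entered into the electronic dictionary by voice for translation, which simplifies looking up English words.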
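The preprocessing and end-point detection steps named in the abstract can be sketched as follows. This is a minimal illustration in Python, not the thesis's Visual Basic 6.0 implementation. The abstract does not specify how the double-threshold method works, so the version here is an assumption based on a common textbook scheme: frames above a high energy threshold anchor the speech segment, which is then extended outward while the energy stays above a low threshold or the zero-crossing rate stays high. All function names, the pre-emphasis coefficient 0.97, and the threshold values are hypothetical.

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def split_frames(signal, frame_len, hop):
    """Cut the signal into frames of frame_len samples; a trailing partial frame is dropped."""
    return [signal[s:s + frame_len] for s in range(0, len(signal) - frame_len + 1, hop)]

def hamming_window(frame):
    """Apply a Hamming window to one frame."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) for i, x in enumerate(frame)]

def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_endpoints(frames, energy_high, energy_low, zcr_threshold):
    """Assumed double-threshold scheme: return (first, last) speech frame indices, or None."""
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    strong = [i for i, e in enumerate(energies) if e > energy_high]
    if not strong:
        return None  # no frame is confidently speech
    start, end = strong[0], strong[-1]

    def speech_like(i):
        # Weak frames still count if energy or zero-crossing rate stays elevated.
        return energies[i] > energy_low or zcrs[i] > zcr_threshold

    while start > 0 and speech_like(start - 1):
        start -= 1
    while end < len(frames) - 1 and speech_like(end + 1):
        end += 1
    return start, end

# Synthetic test signal: silence, a tone standing in for speech, then silence again.
tone = [math.sin(2 * math.pi * n / 20) for n in range(800)]
demo_signal = [0.0] * 400 + tone + [0.0] * 400
demo_frames = split_frames(demo_signal, frame_len=100, hop=100)
demo_endpoints = detect_endpoints(demo_frames, energy_high=10.0, energy_low=1.0, zcr_threshold=0.5)
print(demo_endpoints)
```

In this sketch the frames between the detected end points would then be pre-emphasized and windowed before cepstral features are computed. Real thresholds are usually estimated from the background noise at the start of the recording rather than fixed by hand.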

Parallel Abstract


In this dissertation, hidden Markov models and the Viterbi algorithm are the major tools for speech recognition research, and they are used to develop a simulation system for an English-Chinese electronic dictionary. First, the speech signal is pre-processed, and its characteristic coefficients are then extracted: cepstral coefficients and differential cepstral coefficients. The hidden Markov models and the Viterbi algorithm are then used to train the speech models and perform the recognition task. End-point detection is applied to the speech signal to discard silent periods. This method uses the signal energy and the zero-crossing rate to decide whether the signal is in a silent period. We also developed a double-threshold end-point detection method to correct the errors that additive noise causes in regular end-point detection. In addition, we used Visual Basic 6.0 as the development tool to build a speech recognition system with three parts: model training, speech recognition, and simulation. On top of this system, we designed an English-Chinese electronic dictionary that offers voice input. With this function, users enter words by voice instead of the keyboard, which simplifies looking up a word.
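The recognition step described above, scoring an utterance against trained hidden Markov models with the Viterbi algorithm, can be illustrated with a small sketch. It is written in Python rather than the Visual Basic 6.0 used in the dissertation, and it assumes discrete observation symbols (for example, feature vectors that have already been quantized), which the abstract does not state. The model parameters and the labels "A" and "B" are made up for illustration.

```python
import math

def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best log-probability, best state path) for a discrete-observation HMM."""
    n_states = len(start_p)
    # Initialization: start in each state and emit the first symbol.
    delta = [_log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in range(n_states)]
    backptr = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at this time step.
            best_prev = max(range(n_states), key=lambda r: delta[r] + _log(trans_p[r][s]))
            new_delta.append(delta[best_prev] + _log(trans_p[best_prev][s]) + _log(emit_p[s][o]))
            ptr.append(best_prev)
        delta = new_delta
        backptr.append(ptr)
    # Backtrack from the best final state.
    last = max(range(n_states), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

def recognize(obs, models):
    """Pick the label whose model gives the highest Viterbi score for obs."""
    return max(models, key=lambda label: viterbi(obs, *models[label])[0])

# Two hypothetical left-to-right models over the symbol alphabet {0, 1}.
start = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "A": (start, trans, [[0.9, 0.1], [0.1, 0.9]]),
    "B": (start, trans, [[0.1, 0.9], [0.9, 0.1]]),
}
observation = [0, 0, 1, 1]
score, path = viterbi(observation, *models["A"])
best_label = recognize(observation, models)
print(best_label, path, score)
```

Working in log probabilities avoids numerical underflow on long utterances. With one model per vocabulary item, recognition reduces to picking the model with the highest Viterbi score, which is the design the `recognize` helper assumes.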

