Design of an English-Chinese Electronic Dictionary Using Speech Recognition Technology

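This dissertation uses hidden Markov models (HMMs) and the Viterbi algorithm as the main tools for speech recognition research. The speech signal first goes through a series of preprocessing steps: silence removal, framing, pre-emphasis, and windowing. Feature parameters are then extracted from the signal, namely cepstral coefficients and differential cepstral coefficients. Once the features have been extracted, hidden Markov models and the Viterbi algorithm are used to train the speech models and to recognize the test utterances. For silence removal, we adopt time-domain endpoint detection, which decides whether a segment is silence based on the signal energy and the zero-crossing rate. We also develop a double-threshold endpoint detection method to reduce the endpoint detection errors that arise with noisy speech. In addition, using Visual Basic 6.0 as the development tool, we built a working speech recognition system with three parts: model training, corpus recognition, and analysis and simulation. Within this system we also designed a spoken English-Chinese dictionary function that offers voice input: the user speaks the letters of the English word to be looked up, and the word is passed to the electronic dictionary for translation, which simplifies looking up English words.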
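The silence removal step described above can be sketched roughly as follows. This is only an illustrative sketch in Python, not the dissertation's Visual Basic 6.0 implementation: the function names, the frame length, the hop size, and all threshold values are assumptions chosen for the example, and the rule shown (a high energy threshold to find the speech core, a low energy threshold to extend it, and a bounded extension by zero-crossing rate) is one common form of double-threshold endpoint detection rather than the author's exact method.

```python
def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / max(len(frame) - 1, 1)

def detect_endpoints(samples, frame_len=200, hop=100,
                     high_energy=1.0, low_energy=0.1,
                     zcr_threshold=0.3, max_zcr_frames=5):
    """Return (start_frame, end_frame) of the speech region, or None.

    All default values are hypothetical and would need tuning on real data.
    """
    frames = frame_signal(samples, frame_len, hop)
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]

    # Step 1: frames above the high energy threshold form the speech core.
    core = [i for i, e in enumerate(energy) if e > high_energy]
    if not core:
        return None
    start, end = core[0], core[-1]

    # Step 2: extend outward while energy stays above the low threshold.
    while start > 0 and energy[start - 1] > low_energy:
        start -= 1
    while end < len(frames) - 1 and energy[end + 1] > low_energy:
        end += 1

    # Step 3: extend a few more frames where the zero-crossing rate is high,
    # which tends to recover weak unvoiced sounds at the word boundaries.
    steps = 0
    while start > 0 and steps < max_zcr_frames and zcr[start - 1] > zcr_threshold:
        start -= 1
        steps += 1
    steps = 0
    while (end < len(frames) - 1 and steps < max_zcr_frames
           and zcr[end + 1] > zcr_threshold):
        end += 1
        steps += 1
    return start, end
```

The returned frame indices map back to samples as `start * hop` and `end * hop + frame_len`. Capping the zero-crossing extension at `max_zcr_frames` is a design choice made here to keep noisy silence from being absorbed into the speech region.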

Keywords

hidden Markov model; Viterbi algorithm; speech recognition

Parallel Abstract

In this dissertation, hidden Markov models and the Viterbi algorithm are the major tools for speech recognition research, and they are used to develop a simulation system for an English-Chinese electronic dictionary. First, the speech signal is pre-processed through silence removal, framing, pre-emphasis, and windowing, and the speech feature coefficients are then extracted, namely cepstral coefficients and differential cepstral coefficients. Next, hidden Markov models and the Viterbi algorithm are used to train the speech models and perform the recognition task. End-point detection is applied to the speech signal to discard silent periods. This method uses the signal energy and the zero-crossing rate to decide whether the speech signal is in a silent period. We also developed a double-threshold end-point detection method to reduce the errors that additive noise causes in regular end-point detection. In addition, we used Visual Basic 6.0 as the development tool to create a speech recognition system with three parts: model training, speech recognition, and simulation. Based on this system, we also designed an English-Chinese electronic dictionary that offers voice input. With this function, users enter words by voice rather than by keyboard, which simplifies the task of looking up a word.
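As an illustration of the recognition step, the following is a minimal sketch of Viterbi decoding in Python. It assumes a discrete-observation HMM whose probabilities are stored in nested dictionaries; the abstract does not say how the cepstral features are modeled, so the discrete symbols, the function names `log_table` and `viterbi`, and the data layout are all assumptions made for this example, and the code is not the dissertation's Visual Basic 6.0 implementation.

```python
import math

def log_table(table):
    """Convert a dict (possibly nested) of probabilities to log-probabilities."""
    if isinstance(table, dict):
        return {k: log_table(v) for k, v in table.items()}
    return math.log(table) if table > 0 else float("-inf")

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for one observation sequence.

    Hypothetical layout: log_start[s], log_trans[prev][s], log_emit[s][symbol].
    Working in the log domain avoids underflow on long sequences.
    """
    # score[s]: best log-probability of any state path ending in s so far.
    score = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        prev_score, score, pointers = score, {}, {}
        for s in states:
            # Choose the predecessor that maximizes the path score into s.
            best_prev = max(states, key=lambda p: prev_score[p] + log_trans[p][s])
            score[s] = prev_score[best_prev] + log_trans[best_prev][s] + log_emit[s][obs]
            pointers[s] = best_prev
        backpointers.append(pointers)

    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: score[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[last], path
```

In an isolated-word setting, one common arrangement is to keep one trained model per vocabulary item, run `viterbi` on the test utterance against each model, and pick the item whose model yields the highest log-probability. Whether the dissertation's system is organized this way is not stated in the abstract.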

Parallel Keywords

Viterbi algorithm; hidden Markov model; speech recognition
