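This thesis compares models trained with the Minimum Classification Error (MCE) discriminative criterion against models trained by other methods, and applies different robustness techniques to raise the recognition rate of a speech recognition system. In this study, we compute the modified Two Dimension Cepstrum (TDC) directly from the speech data and apply a Genetic Algorithm (GA), using the results as the feature parameters for speech analysis. In a speech system, a mismatch between the training conditions and the noisy environment in which the system is used severely reduces the recognition rate. This thesis therefore uses MCE to make the feature parameters of the speech data robust, and then builds speech models with the Gaussian Mixture Model (GMM) and other methods. We then use the system to recognize speech: 10 speakers (5 male, 5 female) provided 11,000 speech files in total, with each speaker reading the Chinese digits (0-9) 10 times; 1,040 files from each speaker serve as reference files and the rest as test files. Testing is carried out under rapidly varying background noise, the recognition rates obtained with the different robustness and modeling methods are recorded, and the results are finally compared and discussed.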
This thesis investigates models trained with Minimum Classification Error (MCE), compares them with models trained in other ways, and uses different enhancement methods to improve the performance of a speech recognition system. In this study, we used the Modified Two Dimension Cepstrum (MTDC) and a Genetic Algorithm (GA) to convert the speech data into features for speech recognition. A mismatch between the acoustic conditions of the training environment and those of the application environment seriously degrades the performance of a speech recognition system. This thesis therefore employs MCE-based Two Dimension Cepstrum (TDC) to enhance the speech features, and then uses the Gaussian Mixture Model (GMM) to build the speech models. Next, we used the system to recognize speech. The corpus consists of the Chinese digits (0-9) spoken by 10 speakers (5 male and 5 female), each of whom uttered every digit 10 times (total files: 11000). We selected 1040 files from each speaker as training files and used the remainder as testing files. Finally, we compared and discussed the results obtained under several varying background noises in different conditions.