
An Application of Empirical Mode Decomposition Method to Speech Recognition in Noisy Environment (雜訊環境下經驗模態分解法於語音辨識之應用)

Advisor: 莊堯棠

Abstract

In this thesis, we apply the Empirical Mode Decomposition (EMD) method proposed by Dr. Huang. EMD uses the time scales of a signal's internal variations to extract energy and frequency directly, decomposing the signal into a combination of several Intrinsic Mode Functions (IMFs). These basis functions capture the characteristics of the signal at different scales and can express its physical properties. We apply the information in this basis to a keyword spotting technique in order to improve the recognition rate in low-SNR environments with uniformly distributed white noise.

Using the first IMF extracted by EMD, we carry out a preliminary qualitative and quantitative analysis of noise removal for speech signals in noisy environments. Removing part of the noise from the speech signal during front-end processing improves the recognition rate and achieves an effect similar to speech enhancement. In addition, we use the first IMF to estimate the SNR of the signal and perform model switching, which selects the acoustic model to be used in the recognition stage. Experimental results show that this also improves the recognition rate in low-SNR environments.

Finally, we combine the two methods to further improve the recognition rate. Experimental results show that we can correctly estimate which SNR range the test utterances fall in and switch to the better acoustic model for that range in the recognition stage, with a switching accuracy of up to 97.95%. In this way, we achieve relative improvements of 56.25% and 27.56% at SNR = 0 dB and SNR = 10 dB, respectively.
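The model-switching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that the first IMF (computed elsewhere by an EMD routine) roughly stands in for the noise, and that signal and noise are uncorrelated so their powers add. The SNR ranges, the model names, and the function names `estimate_snr_db` and `select_model` are all hypothetical, since the abstract does not specify them.

```python
import math

# Hypothetical SNR ranges (dB) and model names; the abstract does not list them.
MODEL_BY_SNR_RANGE = [
    (5.0, "model_snr_0dB"),     # estimated SNR below 5 dB
    (15.0, "model_snr_10dB"),   # 5 dB <= estimated SNR < 15 dB
    (math.inf, "model_clean"),  # 15 dB and above
]

def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)

def estimate_snr_db(noisy, first_imf):
    """Rough SNR estimate, ASSUMING the first IMF approximates the noise."""
    noise_power = max(power(first_imf), 1e-12)
    # Assumes uncorrelated signal and noise, so their powers add.
    signal_power = max(power(noisy) - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / noise_power)

def select_model(snr_db):
    """Return the name of the acoustic model for the range containing snr_db."""
    for upper_bound, model_name in MODEL_BY_SNR_RANGE:
        if snr_db < upper_bound:
            return model_name
    return MODEL_BY_SNR_RANGE[-1][1]

# Toy usage: the sample values are made up for illustration only.
noisy = [2.0, -2.0, 2.0, -2.0]
first_imf = [1.0, -1.0, 1.0, -1.0]
chosen = select_model(estimate_snr_db(noisy, first_imf))
```

In a real system the first IMF would come from an EMD implementation, and the range boundaries would be tuned on development data for each trained acoustic model.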


Cited by

林宜亭 (2009). 運用多尺度熵於聲帶手術住院病人術後麻醉身體平衡能力恢復之研究 [Master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2009.00089
吳秋燕 (2007). 一維經驗模態分解法於TFT-LCD面板影像之 Mura (光源不均)瑕疵檢測 [Master's thesis, Yuan Ze University]. Airiti Library. https://doi.org/10.6838/YZU.2007.00136
郭建成 (2007). 經驗模態分解應用於敲擊回音法之鋼筋與裂縫辨識 [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2007.02683
劉彥宏 (2007). 利用SVM結合多重貝氏網路之適性學習系統研發─以國小數學領域分數的乘法為例 [Master's thesis, Asia University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0118-0807200916283601
溫家誠 (2008). 多媒體應用之語音辨識系統 [Master's thesis, National Central University]. Airiti Library. https://www.airitilibrary.com/Article/Detail?DocID=U0031-0207200917352440
