
改良式梅爾倒頻譜係數混合多種語音特徵之研究

Improved Mel Frequency Cepstral Coefficients Combined with Multiple Speech Features

Advisor: 莊堯棠

Abstract


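This thesis addresses feature extraction and feature compensation in speech recognition systems. For feature extraction, different speech features are combined; of the combinations tested, Linear Prediction Cepstral Coefficients (LPCC) combined with Mel-Frequency Cepstral Coefficients (MFCC) performs best. The MFCCs are computed with a Gaussian-shaped Mel filter bank in place of the usual triangular filter bank. Experiments confirm that combining LPCC and MFCC at a 1:1 ratio gives the best results. In addition to combining features, the thesis applies Cepstral Mean and Variance Normalization (CMVN) to compensate the cepstral coefficients and improve the recognition performance of the overall system.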
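To illustrate the filter bank substitution described in the abstract, the sketch below contrasts a triangular Mel filter weight with a Gaussian-shaped one. The abstract does not say how the Gaussian widths are chosen, so the `width_scale` parameter (standard deviation as a fraction of the triangle's support) is an assumption made for illustration, as are all function names and the common 2595 / 700 Mel-scale formula. It is not the thesis's implementation.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to Mel using the common 2595*log10(1 + f/700) formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_edges(num_filters, f_low, f_high):
    """Return num_filters + 2 frequencies (Hz) equally spaced on the Mel scale.

    Filter m uses edges[m], edges[m + 1], edges[m + 2] as left, center, right.
    """
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(num_filters + 2)]


def triangular_weight(f, left, center, right):
    """Conventional triangular filter: 1 at the center, 0 at and beyond the edges."""
    if left < f <= center:
        return (f - left) / (center - left)
    if center < f < right:
        return (right - f) / (right - center)
    return 0.0


def gaussian_weight(f, left, center, right, width_scale=0.25):
    """Gaussian-shaped filter centered on the same Mel-spaced frequency.

    ASSUMPTION: sigma = width_scale * (right - left). The source abstract
    does not specify the width, so this choice is purely illustrative.
    """
    sigma = width_scale * (right - left)
    return math.exp(-0.5 * ((f - center) / sigma) ** 2)


def build_filter_bank(num_filters, num_bins, sample_rate, weight_fn):
    """Return a num_filters x num_bins weight matrix over 0..Nyquist."""
    nyquist = sample_rate / 2.0
    edges = mel_edges(num_filters, 0.0, nyquist)
    bin_hz = [k * nyquist / (num_bins - 1) for k in range(num_bins)]
    return [
        [weight_fn(f, edges[m], edges[m + 1], edges[m + 2]) for f in bin_hz]
        for m in range(num_filters)
    ]


if __name__ == "__main__":
    tri = build_filter_bank(4, 65, 8000.0, triangular_weight)
    gau = build_filter_bank(4, 65, 8000.0, gaussian_weight)
    # The Gaussian filter has smooth tails instead of a hard cutoff.
    print(tri[0][:8])
    print(gau[0][:8])
```

Under these assumptions both filter shapes peak at the same Mel-spaced center frequencies. The difference is that the Gaussian weights decay smoothly and never reach exactly zero, whereas the triangular weights are zero outside the two neighboring centers.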

Parallel Abstract


This thesis studies speech feature extraction and feature compensation in speech recognition. Several speech features are selected and combined. The best combination joins Linear Prediction Cepstral Coefficients (LPCC) with Mel-Frequency Cepstral Coefficients (MFCC). The MFCCs used here are obtained with a Gaussian Mel-frequency filter bank instead of a triangular filter bank. Experiments show that the best combination ratio of LPCC to MFCC is 1:1. The thesis also shows that adding Cepstral Mean and Variance Normalization (CMVN) improves performance further.
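As a rough illustration of the feature combination and CMVN steps named in the abstract, the sketch below joins per-frame LPCC and MFCC vectors and then normalizes each coefficient to zero mean and unit variance over the utterance. It assumes that "1:1" means an equal number of LPCC and MFCC coefficients per frame. The abstract does not define the ratio, so that reading, along with the function names and the toy values, is hypothetical.

```python
import math


def combine_features(lpcc_frames, mfcc_frames):
    """Concatenate LPCC and MFCC vectors frame by frame.

    ASSUMPTION: "1:1" is read as an equal number of coefficients from each
    feature type; the source abstract does not spell this out.
    """
    if len(lpcc_frames) != len(mfcc_frames):
        raise ValueError("LPCC and MFCC must have the same number of frames")
    combined = []
    for lpcc, mfcc in zip(lpcc_frames, mfcc_frames):
        if len(lpcc) != len(mfcc):
            raise ValueError("1:1 combination expects equal coefficient counts")
        combined.append(list(lpcc) + list(mfcc))
    return combined


def cmvn(frames, eps=1e-10):
    """Cepstral mean and variance normalization over one utterance.

    Each coefficient (column) is shifted to zero mean and scaled to unit
    variance across all frames. eps guards against division by zero.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    means = [sum(fr[d] for fr in frames) / num_frames for d in range(dim)]
    stds = [
        math.sqrt(sum((fr[d] - means[d]) ** 2 for fr in frames) / num_frames)
        for d in range(dim)
    ]
    return [
        [(fr[d] - means[d]) / (stds[d] + eps) for d in range(dim)]
        for fr in frames
    ]


if __name__ == "__main__":
    # Toy values only; real LPCC/MFCC vectors typically have 12 or more coefficients.
    lpcc = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    mfcc = [[0.5, -1.0], [1.5, 0.0], [2.5, 1.0]]
    normalized = cmvn(combine_features(lpcc, mfcc))
    for frame in normalized:
        print(frame)
```

Because this form of CMVN treats every coefficient independently, normalizing before or after concatenation gives the same result, so the sketch does not depend on the order the thesis actually uses.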

