Robust Speech Recognition in Noisy Environments by Applying Linear Predictive Coding to Feature Time Series

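In this paper, we propose a new method that uses linear predictive coding (LPC) to strengthen the noise robustness of features in speech recognition. The LPC technique is applied to the time series of cepstral speech features to extract the prediction error component, which is then subtracted from the original feature sequence. The resulting feature sequence is found to be more noise-robust than the original one. In experiments on the Aurora-2 digit corpus, which contains various types of noise, cepstral features that had already undergone various robustness pre-processing techniques achieved better recognition performance after further processing with the proposed method. Moreover, recognition accuracy improves effectively even when the linear prediction order is very low, which indicates that the proposed technique can be implemented efficiently.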

Keywords

Linear predictive coding; feature time series; noise robustness

Parallel Abstract

In this paper, we present a novel method for extracting noise-robust speech feature representations for speech recognition. The method applies linear predictive coding (LPC) to the feature time series of mel-frequency cepstral coefficients (MFCC). The resulting linear predictive version of the feature time series, from which the linear prediction error component has been removed, proves more noise-robust than the original, probably because the portion of the prediction error that corresponds to the noise effect is reduced. Experiments conducted on the Aurora-2 connected digit database show that the presented approach enhances the noise robustness of various types of features, yielding significant improvements in recognition performance across a wide range of noise environments. Furthermore, a low linear prediction order suffices to give promising performance, which implies that the method can be implemented quite efficiently.

Parallel Keywords

Noise Robustness; Speech Recognition; Linear Predictive Coding; Temporal Filtering
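
The procedure summarized in the abstract (fit a linear predictor to each feature time series, then discard the prediction error) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the autocorrelation method for estimating the predictor, the small diagonal loading, the zero-padding of the first frames, and the function names `lpc_coefficients`, `lpc_smooth`, and `lpc_filter_features` are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def lpc_coefficients(x, order):
    """Estimate predictor coefficients a_1..a_p (autocorrelation method, assumed)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    if r[0] <= 0.0:
        # All-zero series: nothing to predict
        return np.zeros(order)
    # Toeplitz matrix R[i, j] = r[|i - j|]
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[lags]
    # Small diagonal loading (an assumption) keeps the linear solve stable
    R = R + 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])


def lpc_smooth(x, order=2):
    """Return the linear-predicted series, i.e. x minus its prediction error."""
    x = np.asarray(x, dtype=float)
    a = lpc_coefficients(x, order)
    # Zero-pad the past so the first `order` frames also get a prediction
    padded = np.concatenate([np.zeros(order), x])
    pred = np.zeros_like(x)
    for k in range(1, order + 1):
        # padded[order - k + n] equals x[n - k]
        pred += a[k - 1] * padded[order - k : order - k + len(x)]
    return pred


def lpc_filter_features(features, order=2):
    """Apply lpc_smooth to each column of a (frames, dims) feature matrix."""
    features = np.asarray(features, dtype=float)
    return np.column_stack(
        [lpc_smooth(features[:, d], order) for d in range(features.shape[1])]
    )
```

Under these assumptions, `lpc_filter_features(mfcc, order=2)` would take a `(frames, dims)` array of (possibly pre-normalized) cepstral features and return an array of the same shape. The low default order reflects the abstract's remark that a low prediction order already suffices.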


