The performance of conventional speaker recognition systems is severely degraded by interference such as additive or convolutional noise. High-level speaker information is considered a more robust cue for recognizing speakers. This paper proposes auditory-model-based spectral features, auditory cepstral coefficients (ACCs), together with a spectro-temporal modulation filtering (STMF) process, to capture high-level information for robust speaker recognition. Text-independent, closed-set speaker recognition experiments are conducted on the TIMIT and GRID corpora to evaluate the robustness of ACCs and the benefits of the STMF process. Experimental results show that ACCs significantly outperform conventional MFCCs in all SNR conditions. The STMF process is also shown to outperform the recently developed ANTCCs in low-SNR conditions.