Emotion Recognition from Mandarin Speech Signals

Advisor: 包蒼龍

Abstract


This thesis presents a method for classifying emotion in Mandarin speech. Five basic human emotions are investigated: anger, boredom, happiness, neutral, and sadness. Conventional features for speech-based emotion classification are statistics of fundamental frequency, loudness, duration, and voice quality. However, the performance of systems that use these features degrades substantially when more than two valence categories of emotion must be distinguished. For speech emotion recognition, we select 16 LPC coefficients, 12 LPCC coefficients, 16 LFPC coefficients, 16 PLP coefficients, 20 MFCC coefficients, and jitter to form the feature vector. Two text-dependent, speaker-independent corpora are used in the experiments. Three classifiers are evaluated: LDA, K-NN, and HMMs. The results show that the selected features are robust and effective for emotion recognition along both the arousal dimension and the valence dimension. LDA achieves an average accuracy of 79.2%, K-NN achieves 83.9%, and HMMs achieve the highest average accuracy, 88.1%.
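As an illustration of how a feature vector of this kind might feed the K-NN classifier, the following is a minimal sketch and not the thesis's implementation. It assumes the listed coefficients are concatenated into a single vector with jitter contributing one value (81 dimensions in total), and it assumes Euclidean distance with majority voting. The names `FEATURE_COUNTS`, `knn_classify`, and `toy_training_set` are hypothetical, and the toy data are synthetic placeholders, not values from the corpora.

```python
import math
from collections import Counter

# Coefficient counts as listed in the abstract. Treating jitter as a single
# scalar and concatenating everything into one vector is an assumption.
FEATURE_COUNTS = {"LPC": 16, "LPCC": 12, "LFPC": 16, "PLP": 16, "MFCC": 20, "jitter": 1}
FEATURE_DIM = sum(FEATURE_COUNTS.values())

EMOTIONS = ("anger", "boredom", "happiness", "neutral", "sadness")


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors.

    `train` is a list of (vector, label) pairs. Euclidean distance is an
    assumption; the abstract does not state which metric was used.
    """
    if len(query) != FEATURE_DIM:
        raise ValueError(f"expected {FEATURE_DIM} features, got {len(query)}")
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def toy_training_set():
    """Synthetic placeholder data: two constant vectors per emotion."""
    data = []
    for i, emotion in enumerate(EMOTIONS):
        data.append(([float(i)] * FEATURE_DIM, emotion))
        data.append(([i + 0.1] * FEATURE_DIM, emotion))
    return data


if __name__ == "__main__":
    query = [3.05] * FEATURE_DIM  # lies between the two "neutral" toy vectors
    print(FEATURE_DIM, knn_classify(toy_training_set(), query, k=3))
```

In practice the vectors would come from a feature-extraction front end applied to each utterance, and the choice of `k` and of the distance metric would be tuned on held-out data.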

Keywords

Emotion Recognition; LFPC; LPC; Mandarin; MFCC; PLP

