
多種語音結構於情緒辨識之初步研究

A Preliminary Study of Various Speech Feature Configurations in Emotion Recognition

Advisor: 洪志偉

Abstract


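In this thesis, we give a preliminary introduction to the background of emotion recognition and the architecture of automatic emotion recognition systems, and we discuss how various speech features perform in emotion recognition. We find that the Mel-frequency cepstral coefficients (MFCC) traditionally used in speech recognition yield a lower emotion recognition rate than the logarithmic frequency power coefficients (LFPC). However, when we apply a discrete cosine transform to the LFPC to obtain logarithmic frequency cepstral coefficients (LFCC), the LFCC results are better than those of LFPC. These results were obtained on Emotional Prosody Speech and Transcripts, an internationally well-known and widely used emotional speech database, and are therefore highly credible. We thus verify that they agree with the general consensus in speech recognition: cepstral features are more discriminative than log-spectral features, in both speech content recognition and speech emotion recognition.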

Parallel Abstract


In this thesis, we briefly introduce several aspects of emotion recognition, including its background, the structure of recognition systems, and several feature representations. Among these representations, the logarithmic frequency power coefficients (LFPC) perform better than the Mel-frequency cepstral coefficients (MFCC) that are widely used in speech recognition. This thesis proposes further processing the LFPC features with a discrete cosine transform (DCT) to reduce the mutual dependence among the LFPC features and to emphasize the vocal tract information in the speech sound. The resulting features are named logarithmic frequency cepstral coefficients (LFCC). Experiments on the well-known emotion recognition database Emotional Prosody Speech and Transcripts show that the proposed LFCC outperform both LFPC and MFCC in emotion recognition.
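The LFPC-to-LFCC step described above can be illustrated with a minimal sketch. It assumes the LFPC features are already available as a matrix of log sub-band energies with one row per frame, and it uses an orthonormal DCT-II; the function names, the number of bands, and the number of retained coefficients are hypothetical choices made for illustration and are not taken from the thesis.

```python
import numpy as np


def dct_ii(x, num_coeffs):
    """Orthonormal DCT-II along the last axis, keeping the first num_coeffs outputs."""
    n = x.shape[-1]
    k = np.arange(num_coeffs)[:, None]   # output (cepstral) index
    m = np.arange(n)[None, :]            # input (sub-band) index
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    scale = np.full(num_coeffs, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    # (frames, bands) @ (bands, num_coeffs) -> (frames, num_coeffs)
    return x @ (basis * scale[:, None]).T


def lfpc_to_lfcc(lfpc, num_coeffs=None):
    """Turn log sub-band energies (LFPC) into cepstral features (LFCC) via a DCT.

    lfpc: array of shape (frames, bands); assumed to hold log energies already.
    num_coeffs: how many cepstral coefficients to keep (hypothetical default: all).
    """
    lfpc = np.asarray(lfpc, dtype=float)
    if num_coeffs is None:
        num_coeffs = lfpc.shape[-1]
    return dct_ii(lfpc, num_coeffs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for real LFPC features: 5 frames, 12 log-frequency bands.
    fake_lfpc = np.log(rng.uniform(0.1, 1.0, size=(5, 12)))
    lfcc = lfpc_to_lfcc(fake_lfpc, num_coeffs=8)
    print(lfcc.shape)
```

Because the DCT basis is orthonormal, keeping all coefficients preserves the energy of each frame; truncating to the lower-order coefficients keeps the smooth spectral envelope, which is the part usually associated with the vocal tract.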

