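Automatic emotional speech recognition is an active research topic in signal processing, and using computers to recognize the emotions conveyed in human speech has a wide range of applications. In this thesis, we apply this technology to computer-assisted language teaching and build a Mandarin emotional speech database covering five emotions: anger, happiness, sadness, boredom, and neutral. For emotion recognition, we extract Mel-frequency cepstral coefficients (MFCCs) as emotion features and classify them with a K-nearest neighbor classifier, obtaining an average recognition rate of 74.6%. For emotion evaluation, we likewise use MFCCs as emotion features and propose a modified K-nearest neighbor method to score the speech. For teaching hearing-impaired learners, we design an emotion radar chart that displays emotion intensity, so that learners can understand their performance graphically. Finally, we integrate these techniques and implement a computer-assisted speech training system for the hearing-impaired.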
Automatic emotional speech recognition is an active research topic in signal processing. In this thesis, we build a Mandarin emotional speech database that includes utterances expressing anger, happiness, sadness, boredom, and neutral emotion. We extract Mel-frequency cepstral coefficients (MFCCs) from each utterance as the emotion feature vector. Using a K-nearest neighbor classifier, we obtain an average recognition accuracy of 74.6%. We also propose a modified K-nearest neighbor method for emotion evaluation. To train hearing-impaired people to speak naturally, we design an emotion radar chart that presents the intensity of each emotion. Combining these techniques, we implement a computer-assisted speech training system.
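
The classification step described above can be illustrated with a minimal sketch of plain K-nearest neighbor voting. This is not the thesis implementation: the function name `knn_classify`, the Euclidean distance, the value `k=3`, and the two-dimensional toy vectors (standing in for real MFCC feature vectors) are all assumptions made for illustration, and the modified K-nearest neighbor method used for emotion evaluation is not reproduced here.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    train: list of (feature_vector, label) pairs.
    query: feature vector with the same length as the training vectors.
    Distance is Euclidean (an assumption; the abstract does not name a metric).
    """
    # Sort training samples by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels among those neighbors and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for MFCC-based features (hypothetical data).
toy_train = [
    ([0.0, 0.0], "neutral"),
    ([0.1, 0.2], "neutral"),
    ([5.0, 5.0], "anger"),
    ([5.2, 4.9], "anger"),
    ([4.8, 5.1], "anger"),
]

if __name__ == "__main__":
    print(knn_classify(toy_train, [0.2, 0.1], k=3))
```

In practice each utterance would first be converted to a fixed-length feature vector derived from its MFCCs before being passed to a classifier of this kind; how that conversion is done is not specified in the abstract.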