Language is the bridge of communication between people, and speech plays an important role in personal relationships and in the workplace. Human emotion accompanies the production of speech, so speech conveys not only what the speaker intends to say but also the speaker's emotional state at that moment. In this thesis, we compare how different methods of segmenting continuous Mandarin speech affect the emotion recognition rate, and we attempt to locate emotion transition points within continuous speech. In the experiments, we apply three segmentation methods to continuous Mandarin emotional speech: uniform segmentation, endpoint detection, and whole-sentence segmentation. The emotion corpus contains utterances in five emotions: anger, happiness, sadness, boredom, and neutral. In the recognition phase, we employ two classifiers, the weighted discrete-KNN (WD-KNN) and the conventional K-nearest neighbor (KNN), each paired with its optimized feature combination. The experimental results show that WD-KNN yields better classification results than KNN, with an average recognition accuracy of 73% on the test sentences.
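As a rough illustration of the pipeline described above, the following Python sketch shows uniform segmentation, a conventional KNN majority vote, and one simple reading of "emotion transition point" as a change of predicted label between consecutive segments. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the segment length, the toy two-dimensional feature vectors, and the value of k are all hypothetical, and the WD-KNN weighting scheme is not reproduced because the abstract does not specify it.

```python
import math
from collections import Counter


def uniform_segments(samples, segment_len):
    """Split a sequence into consecutive fixed-length segments.

    A trailing remainder shorter than segment_len is kept as the last
    segment (an assumption; it could also be dropped or padded).
    """
    return [samples[i:i + segment_len]
            for i in range(0, len(samples), segment_len)]


def knn_classify(query, training, k=3):
    """Conventional KNN: majority vote among the k nearest neighbors.

    `training` is a list of (feature_vector, label) pairs. Distance is
    Euclidean. On a tied vote, the label seen first among the nearest
    neighbors wins.
    """
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def transition_points(labels):
    """Return the indices at which the label differs from the previous one."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Toy data: hypothetical 2-D feature vectors, not real acoustic features.
training = [
    ((0.0, 0.0), "neutral"),
    ((0.1, 0.2), "neutral"),
    ((1.0, 1.0), "anger"),
    ((0.9, 1.1), "anger"),
    ((1.2, 0.8), "anger"),
]

segments = uniform_segments(list(range(10)), 4)
predicted = knn_classify((0.95, 1.0), training, k=3)
changes = transition_points(["neutral", "neutral", "anger", "anger", "sadness"])
```

In practice each segment would first be converted to an acoustic feature vector before classification; that feature extraction step is outside the scope of this sketch.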