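With the development of Web 2.0, sentiment analysis has become a very popular research topic in recent years. The purpose of sentiment analysis is to automatically extract the sentiment attitudes implied by the words in an electronic text and thereby identify the sentiment orientation the text intends to express. SentiWordNet is an important sentiment lexical resource for this task. It is a sentiment lexicon built on WordNet that assigns each WordNet synset three scores representing positive, negative, and neutral sentiment polarity; sentiment analysis can then classify electronic texts by looking up the sentiment scores the lexicon assigns to words. Although SentiWordNet helps sentiment analysis, it still has a weakness in word sense identification: because words are commonly polysemous, how the correct sense of a word is selected during sentiment analysis also affects its performance, yet few previous studies that use SentiWordNet for sentiment analysis have examined how the word sense disambiguation problem affects the results. This study therefore takes SentiWordNet as its basis and uses word sense disambiguation methods to build a domain-specific sentiment sub-lexicon that helps select the sense of polysemous words in sentiment analysis and, by alleviating SentiWordNet's word sense disambiguation problem, improves sentiment analysis performance. The experimental results confirm that, compared with using SentiWordNet directly, using the domain-specific sentiment lexicon built after addressing the word sense disambiguation problem does improve classification accuracy.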
With the development of Web 2.0, sentiment analysis has become a popular research topic in recent years. The goal of sentiment analysis is to use automated methods to extract the sentiment attitudes implicit in a text and to correctly identify the text's sentiment orientation. SentiWordNet is an important lexical resource for sentiment analysis. Built on WordNet, it assigns each synset three sentiment scores: positive, negative, and objective. Sentiment analysis can then classify digital text using the scores that SentiWordNet assigns to synsets. Although SentiWordNet supports sentiment analysis, it suffers from a word sense disambiguation problem: a word can have multiple senses, and which sense is chosen affects the result of sentiment analysis. Few studies in the sentiment analysis literature have addressed this problem. This thesis builds a domain-oriented sentiment lexicon based on SentiWordNet to choose the sense of each word during sentiment analysis, thereby improving word sense disambiguation with SentiWordNet and increasing the accuracy of sentiment analysis.
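
To make the sense-selection issue concrete, the following is a minimal illustrative sketch and not the method developed in this thesis. It assumes a hypothetical toy lexicon (`TOY_LEXICON`) whose scores are invented for illustration and are not taken from SentiWordNet, and a hypothetical mapping `domain_senses` that stands in for a domain-specific sub-lexicon. It compares averaging the scores of all senses of a word with using the single sense chosen for the domain.

```python
from typing import Dict, List, Optional, Tuple

# (positive, negative, objective) scores for one sense of a word.
Scores = Tuple[float, float, float]

# Hypothetical toy lexicon; the numbers are invented for illustration only.
TOY_LEXICON: Dict[str, List[Scores]] = {
    "cold": [
        (0.0, 0.75, 0.25),  # sense 0: unfriendly, distant
        (0.0, 0.0, 1.0),    # sense 1: low temperature
    ],
    "nice": [(0.25, 0.0, 0.75)],
    "beer": [(0.0, 0.0, 1.0)],
}


def document_polarity(
    tokens: List[str],
    lexicon: Dict[str, List[Scores]],
    domain_senses: Optional[Dict[str, int]] = None,
) -> float:
    """Sum (positive - negative) over tokens found in the lexicon.

    If domain_senses names a sense index for a word, only that sense is used;
    otherwise the scores of all senses of the word are averaged.
    """
    total = 0.0
    for token in tokens:
        senses = lexicon.get(token)
        if not senses:
            continue  # word not covered by the lexicon
        if domain_senses is not None and token in domain_senses:
            pos, neg, _ = senses[domain_senses[token]]
        else:
            pos = sum(s[0] for s in senses) / len(senses)
            neg = sum(s[1] for s in senses) / len(senses)
        total += pos - neg
    return total


def classify(score: float) -> str:
    """Map a polarity score to a coarse label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


tokens = ["nice", "cold", "beer"]
baseline = classify(document_polarity(tokens, TOY_LEXICON))
with_domain = classify(document_polarity(tokens, TOY_LEXICON, {"cold": 1}))
print(baseline, with_domain)
```

In this toy setting the same three words are labeled differently depending on which sense of "cold" is used, which is the kind of effect a domain-specific sub-lexicon is intended to control.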