
以SentiWordNet為基礎建構具領域特性之情感詞彙庫

Building a domain-oriented sentiment lexicon based on SentiWordNet

Advisor: 洪智力

Abstract


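With the development of Web 2.0, sentiment analysis has become a popular research topic in recent years. Its goal is to automatically extract the sentiment attitudes implied by the words in an electronic text and thereby identify the sentiment orientation the text expresses. SentiWordNet is an important lexical resource for this task. It is a sentiment lexicon built on WordNet that assigns every WordNet synset three polarity scores representing positive, negative, and neutral sentiment, so a text can be classified by looking up the scores the lexicon assigns to its words. Although SentiWordNet supports sentiment analysis, it has a weakness in word sense identification: because words are commonly polysemous, choosing the correct sense of each word affects the outcome of the analysis, yet few earlier studies that use SentiWordNet have examined how word sense disambiguation affects sentiment analysis results. This study therefore takes SentiWordNet as its basis and applies word sense disambiguation methods to build a domain-specific sentiment sub-lexicon that helps select the sense of polysemous words during sentiment analysis, improving the effectiveness of sentiment analysis by addressing SentiWordNet's word sense disambiguation problem. Experimental results confirm that, compared with using SentiWordNet directly, the domain-oriented sentiment lexicon built after addressing the word sense disambiguation problem does improve classification accuracy.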
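
The scoring scheme described in this abstract can be illustrated with a minimal Python sketch. It assumes a toy in-memory lexicon, `TOY_LEXICON`, whose words, glosses, and scores are invented for illustration and are not taken from SentiWordNet; the names `Sense`, `word_polarity`, and `classify` are likewise hypothetical. Averaging over all senses of a word is one simple baseline, not necessarily the one used in the thesis, and it shows how polysemy can blur a word's polarity.

```python
from typing import Dict, List, NamedTuple


class Sense(NamedTuple):
    """One sense of a word with SentiWordNet-style scores."""
    gloss: str
    pos: float
    neg: float

    @property
    def obj(self) -> float:
        # The three scores of a synset sum to 1, so the objective
        # score is whatever remains after positive and negative.
        return 1.0 - self.pos - self.neg


# Hypothetical entries: all glosses and scores are invented for this sketch.
TOY_LEXICON: Dict[str, List[Sense]] = {
    "cheap": [Sense("low in price", 0.5, 0.0),
              Sense("of poor quality", 0.0, 0.75)],
    "great": [Sense("very good", 0.75, 0.0)],
    "noisy": [Sense("full of loud sound", 0.0, 0.5)],
}


def word_polarity(word: str) -> float:
    """Average (pos - neg) over every listed sense; 0.0 for unknown words."""
    senses = TOY_LEXICON.get(word, [])
    if not senses:
        return 0.0
    return sum(s.pos - s.neg for s in senses) / len(senses)


def classify(text: str) -> str:
    """Label a text by the sign of its summed word polarities."""
    score = sum(word_polarity(w) for w in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In this toy lexicon the two senses of "cheap" pull in opposite directions, so the averaged polarity is slightly negative even in a context where "low in price" is clearly the intended, favourable sense.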

Parallel Abstract


With the development of Web 2.0, sentiment analysis has become a popular research topic in recent years. The goal of sentiment analysis is to use automated methods to extract the sentiment attitudes implicit in a text and to assign each article its correct sentiment orientation. SentiWordNet, an important lexical resource for sentiment analysis, is built on WordNet and gives each synset positive, negative, and objective sentiment scores. Sentiment analysis can classify digital text using the sentiment scores that SentiWordNet assigns to synsets. Although SentiWordNet supports sentiment analysis, it suffers from a word sense disambiguation problem: a word can have multiple senses, and the sense chosen affects the result of sentiment analysis. Few scholars in the sentiment analysis literature have discussed this problem. This thesis builds a domain-oriented sentiment lexicon based on SentiWordNet to choose the appropriate sense of each word during sentiment analysis, improve word sense disambiguation with SentiWordNet, and thereby increase the accuracy of sentiment analysis.
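
The idea of a domain-oriented lexicon can be sketched in Python as follows. The abstract does not say which disambiguation method the thesis applies, so this sketch substitutes a simple label-agreement heuristic: for each word, keep the sense whose polarity best matches the labels of the domain documents containing it. The `SENSES` table, its scores, and the function name `build_domain_lexicon` are all hypothetical and are not drawn from SentiWordNet or from the thesis.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# word -> one (pos, neg) pair per sense. Hypothetical values for illustration.
SENSES: Dict[str, List[Tuple[float, float]]] = {
    "cheap": [(0.5, 0.0), (0.0, 0.75)],  # "low in price" vs. "of poor quality"
    "clean": [(0.625, 0.0)],
}


def build_domain_lexicon(labelled_docs: List[Tuple[str, int]]) -> Dict[str, float]:
    """Pick one sense per word from labelled domain documents.

    labelled_docs holds (text, label) pairs, with label +1 for a
    positive document and -1 for a negative one.
    """
    # Sum the labels of the documents in which each known word occurs.
    label_sum: Dict[str, int] = defaultdict(int)
    for text, label in labelled_docs:
        for word in set(text.lower().split()):
            if word in SENSES:
                label_sum[word] += label

    lexicon: Dict[str, float] = {}
    for word, senses in SENSES.items():
        direction = label_sum.get(word, 0)
        # Keep the sense whose polarity agrees most with the domain's
        # direction; with no evidence, max() falls back to the first sense.
        best = max(senses, key=lambda s: direction * (s[0] - s[1]))
        lexicon[word] = best[0] - best[1]
    return lexicon


# Usage: the same word receives a different polarity in each domain.
hotel_reviews = [("cheap clean room", 1), ("cheap and central", 1), ("cheap but dirty", -1)]
furniture_reviews = [("cheap plastic broke", -1), ("cheap flimsy legs", -1)]
hotel_lexicon = build_domain_lexicon(hotel_reviews)
furniture_lexicon = build_domain_lexicon(furniture_reviews)
```

With these invented reviews, "cheap" resolves to its favourable sense in the hotel domain and to its unfavourable sense in the furniture domain, which is the kind of domain-dependent sense choice the abstract describes.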


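Cited by


鍾瑞嘉 (2017). An ensemble-based framework for word-of-mouth sentiment classification [Master's thesis, Chung Yuan Christian University]. Airiti Library. https://doi.org/10.6840/cycu201700906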
