  • Degree thesis

利用領域概念與口碑評價改善文章情感分類

Improvement of WOM Sentiment Classification Based on Integration of Domain Concepts and Ranking Mechanisms

Advisor: 洪智力

Abstract


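With the rise of the Internet and Web 2.0, a growing wealth of valuable opinion resources is available for people to consult. In this age of information overload, however, users cannot filter and interpret every item to determine which information is worth using, so people try to use machines to summarize and understand the meaning of text. The main purpose of sentiment analysis is to detect subjective information in text, such as viewpoints, preferences, and attitudes, and to extract from it rich information for business intelligence, social psychology, or other applications, which decision makers can consult as an important source of intelligence. Research often uses a sentiment corpus as the basis for sentiment classification, and the conventional view is that a correct sentiment corpus with broad coverage can effectively interpret the semantics of an article. However, a traditional static sentiment corpus is fixed and already annotated, and cannot be adjusted or revised across domains or as times change. Fixed sentiment polarity values for words cannot fit all article content: for example, words are used differently in different domains, and currently popular words with sentiment connotations are not collected in static corpora. This study builds an adaptive corpus based on article content, attempts to update its sentiment polarity values dynamically, and uses two factors, domain concepts and word-of-mouth ratings, to seek a more effective, domain-specific method of article sentiment classification. For feature extraction, this study holds that words in an article that carry concepts are more meaningful and representative, and therefore uses ConceptNet for concept extraction. The proposed method makes sentiment classification better fit the characteristics of different domains, frees it from the limits of traditional corpora, and adapts better to changing circumstances, offering a different method for predicting the sentiment orientation of articles. Experimental results show that using the study's adaptive corpus together with ConceptNet for sentiment classification yields higher classification accuracy.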

Parallel Abstract


This paper presents an approach to article sentiment analysis that combines domain concepts and reputation evaluation to build an adaptive corpus, and uses that corpus to improve article sentiment classification. The main task of sentiment analysis is to detect subjective information in text, such as views, preferences, and attitudes, and to extract information that helps users make strategic decisions. A sentiment corpus is often used as the basis for sentiment analysis; however, a traditional static sentiment corpus is fixed and fully annotated, and cannot be adjusted over time or across domains. Fixed sentiment polarities cannot fit all article content, for example because words are used differently in different domains. In this paper, we build an adaptive corpus based on article content. The aim is to update its sentiment polarities dynamically and, using two factors, domain concepts and reputation evaluation, to find a more effective way to improve article sentiment classification. For feature extraction, we suggest that words in an article that carry concepts are more meaningful and more descriptive, so we use a common-sense knowledge base, ConceptNet, to extract features. The proposed method lets sentiment classification adapt to more diverse domains and frees it from the restrictions of traditional corpora, providing a different approach to predicting the sentiment orientation of an article. Experimental results show that using the adaptive corpus together with ConceptNet features yields higher sentiment classification accuracy.
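
The abstract does not state how polarities are updated, so the following is only an illustrative sketch and not the thesis's actual method. It assumes that each review carries a 1 to 5 star rating (standing in for the reputation evaluation) and that concept terms have already been extracted; the thesis uses ConceptNet for that step, whereas here `concept_terms` is a hypothetical hard-coded set. The function names `build_adaptive_lexicon` and `classify` are likewise invented for illustration.

```python
from collections import defaultdict


def build_adaptive_lexicon(reviews, concept_terms):
    """Derive term polarities from rated reviews (illustrative assumption).

    reviews: list of (tokens, rating) pairs, rating on a 1..5 scale.
    concept_terms: set of terms treated as concepts (hypothetical stand-in
    for concepts extracted with ConceptNet).
    Returns a dict mapping each seen concept term to a polarity in [-1, 1],
    computed as the mean normalized rating of the reviews containing it.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, rating in reviews:
        score = (rating - 3) / 2.0  # map 1..5 stars onto -1..1
        # Count each concept term once per review.
        for term in set(tokens) & concept_terms:
            totals[term] += score
            counts[term] += 1
    return {term: totals[term] / counts[term] for term in totals}


def classify(tokens, lexicon):
    """Label a document by the sum of its known term polarities."""
    score = sum(lexicon.get(token, 0.0) for token in tokens)
    return "positive" if score >= 0 else "negative"


# Toy domain data; the tokens and ratings are made up for this example.
reviews = [
    (["battery", "long", "life"], 5),
    (["battery", "drain", "fast"], 1),
    (["screen", "long", "life"], 4),
]
concept_terms = {"battery", "long", "life", "drain", "screen"}

lexicon = build_adaptive_lexicon(reviews, concept_terms)
print(classify(["battery", "drain"], lexicon))
print(classify(["long", "life"], lexicon))
```

Because the lexicon is rebuilt from whatever rated reviews are supplied, rerunning `build_adaptive_lexicon` on reviews from another domain or a later period yields different polarities for the same word, which is the adaptive behaviour the abstract contrasts with a static corpus.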

