基於中文情緒構面模型之電影評論意見分析

Valence-Arousal Dimension-based Opinion Mining for Movie Reviews

Advisor: 許聞廉

Abstract


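Since the advent of Web 2.0 services, more and more information has flooded our lives, and user-generated comments and opinions are now shared and exchanged through social networking sites. In recent years, data mining, machine learning, and natural language processing have flourished and begun to apply semantic and sentiment analysis to existing data. In business, these techniques help decision makers predict market trends and identify potential customers. They also bear on issues such as public welfare, healthcare, and climate.

This study extends the emotion dimension concept proposed by Russell (1980): Valence represents the positive or negative polarity of a word, and Arousal represents the degree of emotion it carries. In this dimensional model, every Chinese word has a Valence value and an Arousal value, each ranging from 1 to 10. We further use the emotion dimensions to perform opinion analysis on movie reviews, determining whether a user recommends a movie. We first collect movie review posts labeled 「好雷」 (positive review) and 「負雷」 (negative review) from the PTT Movie board, apply natural language processing to the posts, and then represent the features of the training and test data with a distributed keyword vector method. We also compare this method with well-known sentiment analysis methods, such as LDA-SVM, Naïve Bayes, K-NN, tf-idf, and Delta tf-idf, to assess its effectiveness.

The results show that our method performs very well on emotion dimension prediction: it can predict Valence and Arousal values for newly emerging Chinese words, and it achieves the best performance among the many existing methods and models. In opinion analysis of movie reviews, taking into account the emotional polarity and degree behind the words a writer uses helps us capture the core of an article accurately. The method outperforms commonly used sentiment analysis methods, reaching an accuracy of 85.5%.

Keywords: opinion analysis, emotion dimensions, ontology, word embeddings, movie reviews, sentiment analysis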

Parallel Abstract


Since Web 2.0 services began, more and more information has filled our lives. At the same time, users' comments and opinions have come to be shared through social media. In recent years, the fields of data mining, machine learning, and natural language processing have started to focus on semantic and emotion analysis. In business, these techniques help decision makers predict market trends and discover potential customers. They also bear on public welfare, healthcare, climate, and other issues.

In this study, we extend the dimensional theory of emotion that Russell proposed in 1980. Valence indicates the positive or negative polarity of a word, and Arousal indicates the degree of emotion of the word. In the Valence and Arousal dimensions, every Chinese word has a Valence value and an Arousal value, both ranging from 1 to 10. We further apply the Valence and Arousal dimensions to the analysis of Chinese movie reviews. To determine whether a user recommends a movie, we collected movie reviews from the PTT movie forum and processed them with natural language processing approaches. Finally, we use distributed keyword vectors to represent the training and testing features. To evaluate its performance, we also compare our method with well-known sentiment analysis methods such as LDA-SVM, Naïve Bayes, K-NN, tf-idf, and Delta tf-idf.

The experimental results show that our method achieves the best performance on Valence and Arousal prediction, and that it can also predict the Valence and Arousal values of unknown words. In opinion mining for movie reviews, our method takes the writer's emotion polarity and degree into account. As a result, it helps us grasp the core of an article accurately and achieves 85.3% accuracy.

Keywords: Opinion Mining, Valence & Arousal Dimension, Ontology, Word Embeddings, Movie Review, Sentiment Analysis
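To make the Valence-Arousal representation concrete, the sketch below scores a tokenized review by averaging the Valence and Arousal values of its words, each on the 1 to 10 scale described above. It is only an illustration under stated assumptions, not the distributed keyword vector method of the thesis: the lexicon entries and their scores, the neutral midpoint of 5.5, and the names `VA_LEXICON`, `va_features`, and `recommends` are all hypothetical.

```python
from statistics import mean

# Hypothetical toy lexicon: word -> (valence, arousal), both on a 1-10 scale.
# These scores are made up for illustration; they are not from the thesis.
VA_LEXICON = {
    "精彩": (8.5, 7.0),
    "感動": (8.0, 6.5),
    "無聊": (2.5, 3.0),
    "失望": (2.0, 5.5),
}

# Assumed neutral point: the midpoint of the 1-10 scale.
NEUTRAL = (5.5, 5.5)


def va_features(tokens):
    """Return (mean valence, mean arousal) over the tokens found in the lexicon."""
    scored = [VA_LEXICON[t] for t in tokens if t in VA_LEXICON]
    if not scored:
        # No known words: fall back to the neutral point.
        return NEUTRAL
    valence = mean(v for v, _ in scored)
    arousal = mean(a for _, a in scored)
    return (valence, arousal)


def recommends(tokens, threshold=5.5):
    """Naive rule (an assumption): recommend if mean valence exceeds the threshold."""
    valence, _ = va_features(tokens)
    return valence > threshold


if __name__ == "__main__":
    review = ["這部", "電影", "精彩", "感動"]
    print(va_features(review), recommends(review))
```

A real system would presumably feed such features, or richer vector representations, into a trained classifier rather than a fixed threshold; the threshold rule here only shows how word-level polarity and degree can be aggregated into a document-level judgment.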
