

A Semi-Supervised Approach for Sentiment Analysis in Chinese Web Content

Advisor: 邱昭彰


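Semantic orientation classification is an essential part of the opinion mining task. This study designs a semantic orientation classification framework, supported by an improved SO-PMI (Semantic Orientation - Pointwise Mutual Information) algorithm, that classifies online product review sentences (sentence level) into three semantic orientations (positive, negative, neutral). The experimental data are product review articles from the cosmetics domain. The results confirm that, on a Traditional Chinese corpus, the proposed framework classifies better than the original SO-PMI algorithm. Moreover, each processing stage in the framework yields a steady improvement in classification performance.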


Developing an effective, automatic mechanism for Internet opinion analysis (or opinion mining) is an important research issue. Within the opinion mining framework, "Sentiment Classification" is one of the most critical tasks. We therefore propose a modified SO-PMI (Semantic Orientation - Pointwise Mutual Information) algorithm (Turney, 2002) to classify semantic orientation in Chinese web content. We apply this method to a sentence-level, three-category (positive, neutral, negative) sentiment classification problem. Our experimental results indicate that: 1. The slanting phenomenon still occurs in the Chinese environment. 2. The method achieves 65.93% accuracy on the sentiment classification task in the cosmetics domain. This outperforms the baseline, and each step of the framework yields a stable, continuous improvement.
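For readers unfamiliar with SO-PMI, the following is a minimal Python sketch of the original idea: a word's semantic orientation is its PMI with positive seed words minus its PMI with negative seed words. It is not the modified algorithm proposed in this thesis, whose details are not given in the abstract. The sentence-level co-occurrence counts (standing in for Turney's search-engine hit counts), the toy corpus, the seed words, the smoothing constant `eps`, and the neutral `threshold` are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_counts(sentences):
    """Count, per word and per ordered word pair, the sentences they appear in."""
    word_count, pair_count = Counter(), Counter()
    for tokens in sentences:
        unique = set(tokens)
        word_count.update(unique)
        for a in unique:
            for b in unique:
                if a != b:
                    pair_count[(a, b)] += 1
    return word_count, pair_count


def pmi(a, b, word_count, pair_count, n, eps=0.01):
    """Smoothed pointwise mutual information; eps (assumed) avoids log(0)."""
    joint = (pair_count[(a, b)] + eps) / n
    p_a = (word_count[a] + eps) / n
    p_b = (word_count[b] + eps) / n
    return math.log2(joint / (p_a * p_b))


def so_pmi(word, pos_seeds, neg_seeds, word_count, pair_count, n):
    """SO-PMI(word) = sum of PMI with positive seeds - sum of PMI with negative seeds."""
    pos = sum(pmi(word, s, word_count, pair_count, n) for s in pos_seeds)
    neg = sum(pmi(word, s, word_count, pair_count, n) for s in neg_seeds)
    return pos - neg


def classify(tokens, pos_seeds, neg_seeds, word_count, pair_count, n, threshold=1.0):
    """Three-way label from the summed SO-PMI of non-seed tokens (threshold is assumed)."""
    seeds = set(pos_seeds) | set(neg_seeds)
    score = sum(
        so_pmi(t, pos_seeds, neg_seeds, word_count, pair_count, n)
        for t in tokens
        if t not in seeds
    )
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Hypothetical pre-segmented toy corpus (not the thesis data).
corpus = [
    ["保濕", "很", "好"], ["保濕", "好"],
    ["過敏", "很", "差"], ["過敏", "差"],
    ["包裝", "好"], ["包裝", "差"],
]
pos_seeds, neg_seeds = ["好"], ["差"]
word_count, pair_count = build_counts(corpus)
n = len(corpus)

for sentence in (["保濕"], ["過敏"], ["包裝"]):
    print(sentence, classify(sentence, pos_seeds, neg_seeds, word_count, pair_count, n))
```

In this toy setup a word that co-occurs only with the positive seed gets a positive score, one that co-occurs only with the negative seed gets a negative score, and one that co-occurs equally with both lands in the neutral band. A real system would need Chinese word segmentation and a much larger corpus or search-engine counts.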


11. N. Jindal and B. Liu (2006b). Mining Comparative Sentences and Relations. Proceedings of the National Conference on Artificial Intelligence, Boston, MA.
12. N. Jindal and B. Liu (2007). Analyzing and Detecting Review Spam. IEEE International Conference on Data Mining, ICDM 2007, Omaha, NE, IEEE Computer Society.
21. P.D. Turney (2001). Mining the Web for Synonyms: PMI-IR versus LSA on TOEFL. Machine Learning: ECML 2001: 491-502.
23. G. Wang and K. Araki (2008). An Unsupervised Opinion Mining Approach for Japanese Weblog Reputation Information Using an Improved SO-PMI Algorithm. IEICE Transactions on Information and Systems E91-D(4): 1032-1041.
1. A. Abbasi, H. Chen, and A. Salem (2008). Sentiment Analysis in Multiple Languages: Feature Selection for Opinion Classification in Web Forums. ACM Transactions on Information Systems 26(3): 1-34.


