  • Thesis


Modeling the Helpful Opinion Mining of Online Consumer Reviews

Advisor: 吳世弘


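Text mining and opinion mining are emerging research directions, and determining the polarity of an opinion is a popular topic. However, discovering the reason why a user gives a positive or negative opinion is more interesting. With the rapid growth of the Web, gathering relevant information from other users' comments is becoming a necessary step in decision making for individuals and organizations. In opinion mining of online consumer reviews, consumers want to read only helpful opinions, because helpful reviews determine whether they buy a product; companies, in turn, can find the real reasons why people like or dislike a product. To find the underlying reason, we first distinguish whether a sentence, beyond expressing sentiment, is “Helpful” or “Less-Helpful”. If a sentence contains a reason, we consider that the author wrote the review seriously. Our research can remove noisy reviews and let users and companies quickly understand why other people like or dislike something. The first step of the research is to build an experimental corpus: we manually collected Amazon reviews in eight categories, namely books, cameras, computers, food, movies, shoes, toys, and cell phones. We define the sentence types as “Positive Helpful”, “Negative Helpful”, and “Positive/Negative Less-Helpful”. The paper by Connors et al. proposes ten features of “Helpful” or “Less-Helpful” reviews, derived through human analysis. We implemented eight of these features that support our research goal and designed corresponding feature extraction programs: “Pros and Cons”, “Unigram of Product Usage Information”, “Bigram of Product Usage Information”, “Trigram of Product Usage Information”, “Detail”, “Comparisons”, “Lengthy”, and “Use of Ratings”. With these features we can build a classifier to find better reviews. The experimental results show an average accuracy of 73% over the three defined classes: Helpful negative has 74% precision and 64% recall, Helpful positive has 82% precision and 77% recall, and Less-Helpful has 87% precision and 73% recall. We also tested the eight features one by one to find the most important one. The experiments show that “Detail” is the most important: when the “Detail” feature is removed, accuracy drops to 38.569%.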
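The eight features named in this abstract can be pictured as a function from one review sentence to a feature vector. The following is a minimal sketch only, not the thesis's extraction programs: the cue-word lists, the `usage_ngrams` vocabulary, the function name `extract_features`, and the approximation of “Detail” by counting numeric tokens are all assumptions made for illustration.

```python
import re

# Hypothetical cue lists; the abstract does not give the actual lexicons.
PROS_CONS_CUES = {"pros", "cons", "advantage", "advantages",
                  "disadvantage", "disadvantages"}
COMPARISON_CUES = {"than", "better", "worse", "compared", "versus"}


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(sentence, star_rating, usage_ngrams):
    """Map one review sentence to eight illustrative feature values.

    usage_ngrams is a hypothetical set of 1-, 2- and 3-gram tuples
    assumed to signal product usage information.
    """
    tokens = re.findall(r"[a-z0-9']+", sentence.lower())
    token_set = set(tokens)
    return {
        "pros_and_cons": int(bool(token_set & PROS_CONS_CUES)),
        "usage_unigram": sum(g in usage_ngrams for g in ngrams(tokens, 1)),
        "usage_bigram": sum(g in usage_ngrams for g in ngrams(tokens, 2)),
        "usage_trigram": sum(g in usage_ngrams for g in ngrams(tokens, 3)),
        # Assumption: "Detail" is approximated by the number of numeric tokens.
        "detail": sum(t.isdigit() for t in tokens),
        "comparisons": int(bool(token_set & COMPARISON_CUES)),
        "lengthy": len(tokens),
        "use_of_ratings": star_rating,
    }


# Invented example sentence and vocabulary, for illustration only.
usage = {("battery",), ("i", "used"), ("i", "used", "it")}
features = extract_features(
    "I used it for 3 weeks and the battery lasts longer than my old phone.",
    4,
    usage,
)
print(features)
```

A feature dictionary of this shape could be passed to any standard classifier; the abstract does not say which classifier was used.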


In recent research on text mining and opinion mining, finding the polarity of an opinion is a hot topic. However, the reason why a user gives a positive or negative opinion is more interesting. With the rapid growth of the Web in recent years, gathering information from other users' comments has become a necessary step in decision making for people and organizations. To mine the reason behind an opinion, we classify sentences that express sentiment as “Helpful” or “Less-Helpful”. If a sentence contains a reason, we consider that the author wrote the review seriously. Our research helps users and companies quickly understand why people like or dislike something, and it removes noisy reviews. Finding helpful reviews is important: helpful reviews give readers ideas, whereas noisy reviews only waste reading time, so reading only helpful reviews lets readers both see the reasons and understand them quickly. The first step of the research is to create an experimental corpus. We collect Amazon reviews in eight classes: Books, Digital_Camera, Computer, FoodsDrinks, Movies, Shoes, Toys, and Cell-Phone. We manually label each sentence as “Helpful” or “Less-Helpful”. The paper by Connors et al. defines the “Helpful” and “Less-Helpful” features. That paper uses 10 features in its analysis, but the analysis is not automatic. We implement 8 features: “Pros and Cons”, “Product Usage Information” (as unigram, bigram, and trigram features), “Detail”, “Comparisons”, “Lengthy”, and “Use of Ratings”. The overall accuracy on the three-class problem is about 73%. Helpful negative reviews can be found with 82% precision and 77% recall. Helpful positive reviews can be found with 74% precision and 64% recall. Less-Helpful reviews can be filtered out automatically from all the consumer reviews with a high recall of about 87% and 73% precision. The second experiment identifies the most useful feature: “Detail” is the most important of all.
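The reported figures combine one overall accuracy with a precision and a recall for each of the three classes. As a sketch of how such numbers are conventionally computed, the snippet below uses the standard definitions (precision = correct predictions of a class / all predictions of that class; recall = correct predictions of a class / all gold instances of that class). The function name `per_class_metrics` and the toy labels are invented for illustration and are not data from the thesis.

```python
from collections import Counter

LABELS = ["Helpful positive", "Helpful negative", "Less-Helpful"]


def per_class_metrics(gold, predicted, labels=LABELS):
    """Return overall accuracy and {label: (precision, recall)}."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # label p was predicted but is wrong
            fn[g] += 1  # true label g was missed
    accuracy = sum(tp.values()) / len(gold)
    scores = {}
    for label in labels:
        predicted_n = tp[label] + fp[label]
        gold_n = tp[label] + fn[label]
        precision = tp[label] / predicted_n if predicted_n else 0.0
        recall = tp[label] / gold_n if gold_n else 0.0
        scores[label] = (precision, recall)
    return accuracy, scores


# Toy labels invented for illustration only.
gold = ["Helpful positive", "Helpful positive", "Helpful negative",
        "Less-Helpful", "Less-Helpful", "Less-Helpful"]
pred = ["Helpful positive", "Less-Helpful", "Helpful negative",
        "Less-Helpful", "Less-Helpful", "Helpful negative"]
accuracy, scores = per_class_metrics(gold, pred)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
print(f"accuracy={accuracy:.2f}")
```

For the feature-importance experiment, the same function could be rerun after dropping one feature at a time and the resulting accuracies compared.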


[1] Meng-Xiang Li, Liqiang Huang, Chuan-Hoo Tan, and Kwok-Kee Wei, “Assessing The Helpfulness Of Online Product Review: A Progressive Experimental Approach”, In Proceedings of PACIS, 2011.
[4] Soo-Min Kim and Eduard Hovy, “Extracting Opinions Expressed in Online News Media Text with Opinion Holders and Topics”, In COLING-ACL06, 2006.
[5] Susan M. Mudambi and David Schuff, “What Makes a Helpful Online Review? A Study of Customer Reviews on Amazon.com”, MIS Quarterly, Vol. 34, No. 1, pp. 185-200, 2010.
[6] Samaneh Moghaddam, Mohsen Jamali, and Martin Ester, “Review Recommendation: Personalized Prediction of the Quality of Online Reviews”, Proceedings of the 20th ACM International Conference on Information and Knowledge Management, pp. 2249-2252, 2010.
[9] Laura Connors, Susan M. Mudambi, and David Schuff, “Is it the Review or the Reviewer? A Multi-Method Approach to Determine the Antecedents of Online Review Helpfulness”, Proceedings of the 2011 Hawaii International Conference on Systems Sciences (HICSS), January 2011.
