  • Thesis


A Study of Improving the Classification Performance on Manipulated Online Comments

Advisor: 陳隆昇


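With the growth of e-commerce, the Internet has become a good platform for collecting and sharing consumers' personal opinions, preferences, and experiences with products. Using widely available text-based communication tools, customers can easily express their opinions about the products or services they have purchased. In general, online reviews should neutrally reflect consumers' experiences with a product or service. However, some reviews are biased toward "manipulation", which may lower consumers' purchase intentions and cause great harm to businesses. Although manipulated reviews are widespread among online reviews, the literature reports few results on this kind of semantic classification. This study aims to improve the performance of identifying manipulated reviews through feature selection. In the first part, we introduce feature selection and feature extraction techniques and use the important features obtained with Information gain, Decision Tree, Global-LSI, and Local-LSI to try to improve classification performance. In the second part, we introduce 11 potential fake-review features compiled from the literature and show that these features can significantly improve detection accuracy and performance. In the third part, we try to identify the most important attributes among these 11 potential fake-review features. Experiments are conducted on smartphone reviews collected from online forums to confirm the effectiveness of the proposed method.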


With the proliferation of e-commerce, the Internet has become an excellent platform for gathering and sharing consumers' personal views on, preferences for, and experiences with products. With the popularity of text-based communication tools, customers can easily express their opinions about purchased products or services. Generally speaking, online reviews should be unbiased reflections of consumers' experiences with the products or services. However, some comments are biased "manipulations", which might reduce consumers' purchase intentions and do great damage to enterprises. This study aims to improve the performance of manipulation detection by reducing the dimensionality of the feature space. The study is divided into three parts. In the first part, we introduce feature selection and feature extraction techniques, using Information gain, Global-LSI, and Local-LSI to identify important features and reduce the number of dimensions, in order to further improve detection accuracy. In the second part, we adopt 11 features recommended in the literature and show that these features can improve the detection rate significantly. In the third part, we try to find the most important attributes among the 11 manipulation features. A real case study of smartphone reviews is used to illustrate the effectiveness of the proposed features.
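
As an illustration of the feature selection step mentioned in the abstract, the following is a minimal sketch of ranking terms by information gain, assuming each comment is represented as a set of tokens and each term is treated as a binary present/absent feature. The function names (`entropy`, `information_gain`, `select_top_terms`), the toy comments, and the labels are hypothetical and made up for illustration; they are not the thesis's implementation or data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction from splitting the comments on presence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_top_terms(docs, labels, k):
    """Return the k terms with the highest information gain."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Toy, made-up comments as token sets (assumption: not data from the thesis).
docs = [
    {"best", "phone", "ever", "buy"},
    {"buy", "now", "best", "deal"},
    {"battery", "lasts", "one", "day"},
    {"screen", "cracked", "phone"},
]
labels = ["manipulated", "manipulated", "genuine", "genuine"]

top_terms = select_top_terms(docs, labels, k=2)
print(top_terms)
```

In this toy setting, a term that appears only in one class separates the labels perfectly and receives the maximum gain, while a term spread evenly across both classes receives a gain of zero. Keeping only the top-ranked terms is one simple way to reduce the dimensionality of the feature space before training a classifier.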


