Degree thesis

A Study on Identifying Fake Reviews: The Case of the Hotel Industry

Identify the online review with fake content: The case of Hotel Industry

Advisor: 許秉瑜


With the growth of the internet economy, websites of all kinds have accumulated large numbers of consumer reviews of products and services. These reviews have become an important source of information alongside official product information, expert opinion, and the personalized suggestions generated automatically by recommender systems. Surveys show that the share of consumers who use the internet to gather purchase information rises year by year, and researchers have confirmed that consumers attach growing weight to reviews based on other users' first-hand experience, which strongly influence their purchase decisions. Unfortunately, some businesses exploit this trend by deliberately manipulating reviews, either exaggerating the merits of their own products or damaging the commercial reputation of competitors, which seriously harms both consumers and businesses. This study takes fake reviews as its research object and examines how fake and real reviews differ in their grammatical and stylistic features. Using real hotel reviews from the U.S. website TripAdvisor and a comparison group of fake reviews as the data, it extracts three elements from the review text: unique vocabulary, explicit quantifiers, and the proportions of nouns and verbs. Text mining techniques are applied to process the reviews and build a model that classifies fake reviews automatically. The results of the model show that the more unique vocabulary, explicit quantifiers, and nouns a review contains, the less likely it is to be fake.
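As a rough illustration of the kind of features the abstract describes, the following minimal Python sketch computes two of them for a single review. It is not the thesis's implementation. It assumes that "unique vocabulary" can be approximated by the ratio of distinct tokens to all tokens, and that explicit quantities are digits or a small set of English number words. The names `NUMBER_WORDS`, `tokenize`, `unique_word_ratio`, `quantity_count`, and `extract_features` are hypothetical. The noun and verb proportions are left out because they would need a part-of-speech tagger.

```python
import re

# Assumption: a small illustrative set of number words, not taken from the thesis.
NUMBER_WORDS = {
    "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "hundred", "thousand",
}

# Lowercase words (with an optional apostrophe part) or plain numbers.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?|\d+(?:\.\d+)?")


def tokenize(text):
    """Split a review into lowercase word and number tokens."""
    return TOKEN_RE.findall(text.lower())


def unique_word_ratio(tokens):
    """Distinct tokens divided by all tokens (0.0 for an empty review)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def quantity_count(tokens):
    """Count tokens that are digits or listed number words."""
    return sum(1 for t in tokens if t[0].isdigit() or t in NUMBER_WORDS)


def extract_features(review):
    """Return the two illustrative features for one review."""
    tokens = tokenize(review)
    return {
        "n_tokens": len(tokens),
        "unique_word_ratio": unique_word_ratio(tokens),
        "quantity_count": quantity_count(tokens),
    }


if __name__ == "__main__":
    sample = "We stayed 3 nights in room 412 and paid two hundred dollars."
    print(extract_features(sample))
```

Feature dictionaries like these could then be fed to any standard classifier. The abstract does not state which classifier the thesis uses, so that step is left open here.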


Keywords: Data mining; Text mining; Fake reviews; Online reviews




