Translated Titles

Analysis and Detection of Online Paid Restaurant Reviews





Key Words

opinion spam ; fake reviews ; blog ; sponsored posts ; paid online writers ; testimonial advertising ; testimonial



English Abstract

In recent years, people have grown used to sharing their opinions and experiences on the Internet, and these opinions strongly influence the decisions of others. For example, most people read online reviews before making a purchase. Malicious companies and individuals exploit this by using fake reviews to manipulate opinion on social media and blogs. In this thesis, we collect paid reviews on Pixnet to understand this type of promotional campaign, and we identify several characteristics of paid reviews and their writers. Based on these observations, we propose a set of features and detect paid reviews and paid writers using supervised machine learning techniques. The results demonstrate the effectiveness of the features and significantly outperform a random baseline. Furthermore, we propose a collective detection method based on Markov Random Fields that detects paid reviews and paid writers jointly. The collective method exploits the relations among review and user instances, and its results outperform those obtained by detecting reviews and writers separately.
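
The idea of collective detection can be illustrated with a minimal sketch. It is not the model used in the thesis, whose graph structure and potentials the abstract does not specify. It assumes a pairwise binary Markov Random Field whose nodes are users and reviews, whose edges are authorship links, and whose node priors come from separate classifiers. The function name `collective_detect`, the `agree` weight, and all input values are hypothetical.

```python
from collections import defaultdict


def collective_detect(priors, edges, agree=0.8, iters=10):
    """Sum-product belief propagation on a pairwise binary MRF.

    priors: {node: P(paid)} from independent classifiers (assumed inputs);
            every node that appears in `edges` must have a prior.
    edges:  list of (user, review) authorship links.
    agree:  assumed weight for linked nodes sharing the same label.
    Returns {node: belief that the node is paid}.
    """
    # Node potentials over the labels (0 = genuine, 1 = paid).
    phi = {n: (1.0 - p, p) for n, p in priors.items()}
    # Edge potential: linked nodes prefer the same label.
    psi = ((agree, 1.0 - agree), (1.0 - agree, agree))

    nbrs = defaultdict(list)
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    # msg[(i, j)] is the message from i to j over the two labels of j.
    msg = {(i, j): (0.5, 0.5) for i in nbrs for j in nbrs[i]}

    for _ in range(iters):
        new = {}
        for (i, j) in msg:
            out = [0.0, 0.0]
            for xj in (0, 1):
                for xi in (0, 1):
                    # Prior of i times all incoming messages except from j.
                    prod = phi[i][xi]
                    for k in nbrs[i]:
                        if k != j:
                            prod *= msg[(k, i)][xi]
                    out[xj] += psi[xi][xj] * prod
            z = out[0] + out[1]
            new[(i, j)] = (out[0] / z, out[1] / z)
        msg = new  # synchronous update

    beliefs = {}
    for n in phi:
        b = [phi[n][0], phi[n][1]]
        for k in nbrs[n]:
            b[0] *= msg[(k, n)][0]
            b[1] *= msg[(k, n)][1]
        beliefs[n] = b[1] / (b[0] + b[1])
    return beliefs
```

Calling the function with an uninformative prior of 0.5 for a user and high priors for that user's reviews raises the user's belief above 0.5, and each review's belief is in turn reinforced by the other reviews of the same user. This is the kind of mutual influence that separate review and writer classifiers cannot capture.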

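Topic Category Basic and Applied Sciences > Information Science
College of Electrical Engineering and Computer Science > Graduate Institute of Computer Science and Information Engineering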