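Application of Sentiment Analysis in Movie Recommendation and Comment-Revealing System (情感分析於電影推薦與評論展現系統之應用) | National Chengchi University Theses and Dissertations Full-Text Image System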

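Author (Chinese):	黃德潔
Author (English):	Huang, Te-Chieh
Thesis title (Chinese):	情感分析於電影推薦與評論展現系統之應用
Thesis title (English):	Application of Sentiment Analysis in Movie Recommendation and Comment-Revealing System
Advisor (Chinese):	鄭宇庭
Oral defense committee:	謝邦昌、鄧家駒、鄭宗記
Degree:	Master's
Institution:	National Chengchi University
Department:	Department of Statistics
Year of publication:	2020
Academic year of graduation:	108 (ROC calendar)
Language:	Chinese
Number of pages:	73
Keywords (Chinese):	文字探勘 (text mining), 情緒分析 (sentiment analysis), 特徵分群 (feature clustering), 語意指向 (semantic orientation), 機器學習 (machine learning)
Keywords (English):	Text Mining, Sentiment Analysis, Feature Clustering, PMI, Semantic Orientation, Machine Learning
DOI URL:	http://doi.org/10.6814/NCCU202000667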

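With the rise of text-centered social platforms and microblogs such as Weibo, Twitter, and blogs, consumers' evaluations of the goods they buy and the quality of service they receive spread quickly online. These evaluations strongly influence other consumers' willingness to buy and reinforce the public's image of the product's brand. This is especially true of the film industry: consumers see only the excerpts that the distributor has cut into a trailer before deciding whether to go to the cinema, and there is no return or exchange afterwards. Before buying a ticket, people therefore pay particular attention to online reviews and shared impressions of the film. Accurately and efficiently identifying the meaning and sentiment that customers express within this massive volume of online information has consequently become a topic that text-mining researchers have focused on in recent years.
This thesis implements an effective movie evaluation system. It collects short reviews from the user satisfaction ranking (網友滿意榜) on the Yahoo! Kimo Movies website in 2019 and classifies them by polarity using opinion extraction, attribute extraction, sentiment analysis, semantic orientation, feature clustering, and machine-learning classification. The experiments yield an accuracy of 83.74% and an F1-measure of 84.29%, indicating that the study achieves the expected performance in determining review polarity.
The final presentation of reviews has two characteristics. First, reviews are sorted by sentiment intensity in descending order, so that users first see the reviews richest in sentiment and content. Second, by presenting the pairings of opinion words with attribute words, the system gives users a multi-aspect sentiment analysis of the film they search for, showing how the film is rated in each attribute category, and thereby recommends suitable films to consumers.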
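
The semantic-orientation step named in the abstract is often computed from pointwise mutual information (PMI), which also appears among the thesis keywords. The following is a minimal sketch under that assumption and is not the thesis's implementation: the function names, the document-level co-occurrence counting, the zero score for unseen pairs, and the toy English tokens and seed words are all illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Build a pmi(a, b) function from document-level co-occurrence counts."""
    n = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing each word pair
    for tokens in docs:
        uniq = sorted(set(tokens))
        word_df.update(uniq)
        pair_df.update(combinations(uniq, 2))

    def pmi(a, b):
        joint = pair_df[tuple(sorted((a, b)))]
        if joint == 0:
            # Assumption: an unseen pair contributes no evidence either way.
            return 0.0
        return math.log2(joint * n / (word_df[a] * word_df[b]))

    return pmi


def semantic_orientation(word, pmi, pos_seeds, neg_seeds):
    """Association with positive seeds minus association with negative seeds."""
    return sum(pmi(word, s) for s in pos_seeds) - sum(pmi(word, s) for s in neg_seeds)


# Toy tokenized reviews (made up for illustration only).
docs = [
    ["plot", "brilliant", "good"],
    ["acting", "brilliant", "good"],
    ["plot", "dull", "bad"],
    ["pacing", "dull", "bad"],
]
pmi = pmi_table(docs)
for w in ("brilliant", "dull", "plot"):
    print(w, semantic_orientation(w, pmi, ["good"], ["bad"]))
```

A positive score suggests the word leans toward the positive seeds, a negative score toward the negative seeds, and a score near zero suggests a neutral word such as an attribute term.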

Following the rise of text-based social media platforms such as Weibo, Twitter, and blogs, consumers' ratings of purchased commodities and service quality can spread rapidly through social media. This has a significant effect on other consumers' desire to purchase and also shapes the public's impression of the product's brand image. This is especially so in the movie industry, where consumers have to decide whether to go to the theater after watching only the segments shown in a movie trailer, and they cannot get a refund if they regret the choice. Consumers therefore pay more attention to related comments and shared experiences. For this reason, how to correctly identify the sentiment and meaning that consumers express has become a subject to which scholars dedicate themselves.
This thesis builds an effective movie evaluation system. It collects short comments from the netizen satisfaction list on the 2019 Yahoo movie web page and classifies them by polarity using opinion extraction, attribute extraction, sentiment analysis, semantic orientation, feature clustering, and machine learning classification. The experiment shows an accuracy of 83.74% and an F1-measure of 84.29%, which means that this study has achieved its anticipated result in identifying the polarity of comments.
The final presentation of comments has two characteristics. First, comments are listed in descending order of sentiment intensity, so that users browse the richest ones first. Second, matching opinion keywords with feature keywords offers users a multi-faceted analysis that shows the evaluation of each attribute of a film, through which suitable movies are recommended to consumers.
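
The two presentation features described in the abstract, ordering comments by sentiment intensity and summarizing opinion per attribute, can be sketched as follows. This is an illustration only: the `Review` structure, the definition of intensity as the sum of absolute opinion scores, and the per-attribute mean are assumptions and are not taken from the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Review:
    text: str
    # Hypothetical structure: (attribute category, signed opinion score) pairs.
    pairs: list = field(default_factory=list)


def intensity(review):
    """Assumed intensity: total magnitude of all opinion scores in the review."""
    return sum(abs(score) for _, score in review.pairs)


def rank_by_intensity(reviews):
    """Most emotionally loaded reviews first."""
    return sorted(reviews, key=intensity, reverse=True)


def aspect_summary(reviews):
    """Mean signed score per attribute category across all reviews."""
    scores = defaultdict(list)
    for review in reviews:
        for aspect, score in review.pairs:
            scores[aspect].append(score)
    return {aspect: sum(vals) / len(vals) for aspect, vals in scores.items()}


# Made-up reviews for illustration.
reviews = [
    Review("a", [("plot", 1.0)]),
    Review("b", [("plot", -2.0), ("acting", 1.5)]),
    Review("c", []),
]
print([r.text for r in rank_by_intensity(reviews)])
print(aspect_summary(reviews))
```

Sorting on magnitude rather than on the signed score keeps strongly negative comments near the top alongside strongly positive ones, which matches the stated goal of surfacing the richest comments first.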

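Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Research Framework
2. Literature Review
2.1 Attribute Word Extraction
2.2 Opinion Word Selection
2.3 Opinion Word Clustering Methods
2.4 Opinion Polarity Determination
2.5 Machine Learning Classification Methods
3. Research Methods
3.1 Data Preprocessing
3.2 Manual Annotation of Opinion, Attribute, and Negation Words and Lexicon Construction
3.3 Opinion Word Clustering
3.4 Classification Models
4. Experimental Design and Results
4.1 Experimental Data
4.2 Experimental Design
4.3 Classifier Evaluation Criteria
4.4 Experimental Results
4.5 Error Analysis of Experimental Results
5. Conclusions and Suggestions
5.1 Conclusions and Contributions
5.2 Research Limitations and Future Directions
References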

(Full text open for viewing after 2025-06-30)