Author: Chen, Choung-Ru (陳崇儒)
Thesis title: Automatically Extracting Opinion Sentences and Identifying Opinion Holders in News (新聞文件中意見句自動擷取及意見持有者辨識之研究)
Advisor: Hou, Wen-Juan (侯文娟)
Degree: Master
Department: Department of Computer Science and Information Engineering
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Number of pages: 51
Keywords: opinion exploration, opinion extraction, opinion holder identification, machine learning, supervised learning
DOI URL: http://doi.org/10.6345/THE.NTNU.DCSIE.002.2018.B02
Thesis type: Academic thesis
  • The growth of the Internet has made life more convenient, but it also means that a large volume of text has to be read every day. Opinion mining can extract the parts of a text that readers care about. Readers are usually most interested in who expressed which opinion or who put forward which view, and the sentences that carry this information are called opinion sentences. This study proposes a supervised machine learning method that first finds the opinion sentences in an article and then identifies, within those sentences, the article author's opinions and the opinion holders.
    Natural language processing methods are used to identify the article author's opinions and the opinion holders. The processing steps are tokenization, collection of opinion words, stemming, opinion sentence detection, part-of-speech tagging, named entity recognition, and feature extraction for the author's opinions and the opinion holders. For feature extraction, the thesis uses lexical, part-of-speech, punctuation, named entity, syntactic, opinion word, and sentence composition features.
    Experimental results on English news articles show that author opinion identification achieves an F1 score of 69.05% and opinion holder identification achieves an F1 score of 72.06%.
    Keywords: opinion exploration, opinion extraction, opinion holder identification, machine learning, supervised learning
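
    The opinion sentence detection step named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the word list `OPINION_WORDS`, the helper names, and the sample text are hypothetical, and `crude_stem` is a rough suffix stripper standing in for the Porter stemmer listed in the table of contents.

```python
import re

# Hypothetical mini-lexicon of opinion words (the thesis collects its own list).
OPINION_WORDS = {"say", "think", "believe", "claim", "argue", "criticize"}


def tokenize(sentence):
    """Split a sentence into word, number, and punctuation tokens."""
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", sentence)


def crude_stem(token):
    """Rough suffix stripping; an assumption standing in for Porter stemming."""
    word = token.lower()
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so that "argue" and "argued" share one stem.
    if word.endswith("e") and len(word) > 3:
        word = word[:-1]
    return word


# Stem the lexicon once so tokens and lexicon entries are compared in the same form.
OPINION_STEMS = {crude_stem(w) for w in OPINION_WORDS}


def find_opinion_sentences(text):
    """Return (sentence, matched_tokens) for sentences containing an opinion word."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    results = []
    for sentence in sentences:
        matches = [t for t in tokenize(sentence) if crude_stem(t) in OPINION_STEMS]
        if matches:
            results.append((sentence, matches))
    return results


sample = (
    "The minister argued that the plan will fail. "
    "The meeting starts at noon. "
    "Critics say the budget is too small."
)
opinion_sentences = find_opinion_sentences(sample)
```

    In this sketch only the first and third sample sentences are kept, because "argued" and "say" reduce to stems found in the lexicon. Irregular forms such as "said" are missed by suffix stripping, which is one reason a real pipeline would rely on a larger lexicon and a proper stemmer.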

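    Abstract (Chinese)
    Abstract (English)
    Dedication
    Acknowledgments
    List of Figures
    List of Tables
    Chapter 1 Introduction
      Section 1 Research Motivation
      Section 2 Research Objectives
      Section 3 Thesis Organization
    Chapter 2 Related Work
      Section 1 Related Research on Opinion Mining and Opinion Holder Identification
      Section 2 Machine Learning Methods Used in Related Research on Opinion Holder Identification
        (1) Support Vector Machine (SVM)
      Section 3 Related Tools
        (1) Stanford Core NLP Toolkit
        (2) Porter Stemming
    Chapter 3 Research Method
      Section 1 Identification Workflow
      Section 2 Preprocessing
        (1) Tokenize
        (2) Collecting Opinion Words and Finding Opinion Sentences
        (3) Stemming Algorithm
        (4) Part of Speech Tagging
        (5) Named Entity Recognition
        (6) Feature Extraction
      Section 3 Article Author Opinion Identification
        (1) Lexical Information
        (2) Part-of-Speech Information
        (3) Punctuation Information
        (4) Named Entity Information
        (5) Syntactic Information
        (6) Opinion Word Information
      Section 4 Opinion Holder Identification
        (1) Part-of-Speech Information
        (2) Named Entity Information
        (3) Sentence Composition Information
    Chapter 4 Experiments and Results
      Section 1 Experimental Corpus
      Section 2 Article Author Opinion Identification Experiment
      Section 3 Opinion Holder Identification Experiment
    Chapter 5 Conclusion and Future Work
    References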

