Opinion Analysis of Chinese Blog Posts (中文部落格文章之意見分析)

Advisor: 王正豪

Abstract


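Extracting the desired information effectively and quickly from the vast amount of data on the web has become an important problem in information retrieval. Opinion analysis is one part of the information retrieval field; its main purpose is to analyze the opinions and comments contained in data. BBS boards, forum sites, and blogs all contain large numbers of user-generated opinion and review posts. Reading these posts one by one, however, takes a great deal of time and effort, and the content must still be digested and organized before it can support a decision.

Current approaches to opinion analysis fall into two groups: machine learning methods and lexicon-based methods. Machine learning methods can also be effective, but they lack semantic consideration, whereas people obtain opinion information only after reading a text and understanding its meaning. The lexicon-based method adopted in this thesis uses the "Chinese Vocabulary for Sentiment Analysis" (中文情感分析用詞語集) released by HowNet to match and analyze the degree terms and opinion terms in a post, and then computes the post's overall evaluation. Taking digital camera reviews as an example, we searched for related blog posts and ran opinion analysis experiments on them.

The experimental results show that, in analyzing sentiment polarity and strength, the proposed method agrees reasonably well with the objective evaluations that human readers gave after reading the test posts. This indicates that the method can extract opinion terms appropriately and estimate the overall evaluation effectively.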

Keywords

opinion analysis HowNet evaluation blogs

Parallel Abstract


In the information retrieval domain, extracting useful information effectively and quickly from a large amount of web data is an important problem. Opinion analysis is a part of information retrieval; its main purpose is to analyze the opinions and comments in the data. BBS boards, blogs, and forums contain plenty of user-generated opinion and evaluation posts. However, readers must spend a lot of time and effort digesting a large number of posts before they can make decisions. Current methods of opinion analysis can be divided into machine learning methods and lexicon-based methods. People obtain information about opinions only after reading a text and understanding its semantics. Although machine learning methods can also be effective, they lack semantic consideration, which is why we use a lexicon-based method. In this thesis, the "Chinese Vocabulary for Sentiment Analysis" issued by HowNet is used as the major lexicon. First, opinion terms and degree terms are extracted from blog posts by exact matching against the lexicon. Then, a score is assigned to each opinion term and degree term to reflect its opinion strength. Finally, the total score for the blog post is calculated as the average of the opinion term scores in the post. In the experiment, we evaluated our method on blog posts about digital cameras. The results show that the proposed method is highly similar to the manual evaluations of the test posts in terms of the polarity and strength of opinions. This indicates that opinion terms are extracted effectively and that evaluations are calculated effectively. Further investigation is needed to check the effectiveness of parsing sentence structures.
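The three steps described above (exact matching, scoring, averaging) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The lexicon entries, the numeric weights, and the rule that a degree term multiplies the score of the opinion term immediately following it are all assumptions made for illustration; the actual HowNet word lists and the scoring scheme used in the thesis are not given in the abstract. The names `OPINION_TERMS`, `DEGREE_TERMS`, `match_at`, and `score_post` are hypothetical.

```python
# Toy lexicons: illustrative entries and weights only, NOT the actual HowNet lists.
OPINION_TERMS = {"好": 1.0, "清晰": 1.0, "滿意": 1.0,
                 "差": -1.0, "模糊": -1.0, "失望": -1.0}
DEGREE_TERMS = {"非常": 2.0, "很": 1.5, "有點": 0.5}


def match_at(text, pos, lexicon):
    """Return the longest lexicon term that starts at text[pos], or None."""
    best = None
    for term in lexicon:
        if text.startswith(term, pos) and (best is None or len(term) > len(best)):
            best = term
    return best


def score_post(text):
    """Average the degree-weighted scores of all opinion terms found in text."""
    scores = []
    pos = 0
    pending_degree = 1.0  # multiplier set by a degree term directly before an opinion term
    while pos < len(text):
        degree = match_at(text, pos, DEGREE_TERMS)
        if degree is not None:
            # Assumed rule: remember the degree weight for the next term only.
            pending_degree = DEGREE_TERMS[degree]
            pos += len(degree)
            continue
        opinion = match_at(text, pos, OPINION_TERMS)
        if opinion is not None:
            scores.append(pending_degree * OPINION_TERMS[opinion])
            pos += len(opinion)
        else:
            pos += 1
        pending_degree = 1.0  # the degree weight does not carry past this position
    # A post with no opinion terms is treated as neutral (an assumption).
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # "非常清晰" scores 2.0 and "有點模糊" scores -0.5, so the average is 0.75.
    print(score_post("畫質非常清晰,但對焦有點模糊"))
```

Because the sketch matches strings exactly and ignores sentence structure, it cannot handle negation or a term that happens to occur inside a longer unrelated word, which is the kind of limitation the closing remark about parsing sentence structures points to.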

Parallel Keywords

opinion analysis HowNet blogs


