
A Text Mining System for Topic-Based Public Opinion Analysis

Advisor: 林宣華

Abstract


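This thesis integrates text analysis and data mining techniques to develop a public opinion analysis system for online communities. By crawling user comments and replies from fan pages and forums, the system analyzes users' suggestions, sentiment, and background. The large volume of comments is first analyzed and filtered (data cleaning) so that only meaningful comments remain. Keywords are then extracted by dictionary matching, and the sentiment attributes of the vocabulary, such as negation words and affirmation words, are analyzed to build a sentiment tree that records the path and frequency of each node. Through sentiment path analysis, the system derives the positive and negative probabilities of a new comment. Finally, it computes the relevance of domain keywords to each topic and analyzes the important attribute aspects of the topic, together with the positive and negative feedback each aspect receives, so that topic-related public opinion within a given time period can be followed through charts.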
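The dictionary-matching step can be illustrated with a minimal sketch. The lexicons, the function name `tag_tokens`, and the rule that a negation word flips only the immediately following sentiment word are all assumptions made for illustration; the thesis's actual dictionaries and rules are not given in this abstract.

```python
# Hypothetical lexicons for illustration only; not the thesis's dictionaries.
POSITIVE = {"good", "great", "fast"}
NEGATIVE = {"bad", "slow", "broken"}
NEGATORS = {"not", "never", "no"}


def tag_tokens(tokens):
    """Return (keyword, polarity) pairs for tokens found in the lexicons.

    Assumption: a negator flips the polarity of the very next token only.
    """
    tagged = []
    negate = False
    for tok in tokens:
        word = tok.lower()
        if word in NEGATORS:
            negate = True
            continue
        if word in POSITIVE or word in NEGATIVE:
            polarity = "positive" if word in POSITIVE else "negative"
            if negate:
                polarity = "negative" if polarity == "positive" else "positive"
            tagged.append((word, polarity))
        # Any non-negator token ends the scope of a preceding negator.
        negate = False
    return tagged
```

For Chinese text, the token list would come from a word segmenter rather than whitespace splitting; the sketch takes pre-segmented tokens so that it stays independent of any particular segmenter.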

Parallel Abstract


This thesis proposes a public opinion analysis system for online communities that integrates text analysis and data mining methods. Its purpose is to estimate users' sentiment from their comments on Facebook fan pages and forums. First, the system crawls user comments from Facebook Pages, Mobile01, and PTT. It then removes redundant segments and filters out advertising and uninformative comments so that only meaningful comments are preserved. From the remaining comments, the system extracts domain keywords and defines attributes comprising 5W (who, why, when, where, what) and 4S (positive, negative, neutral, ridicule) to construct a sentiment decision tree. It then applies the FP-Growth algorithm to record the path and frequency of each node in the decision tree, so that the system can calculate the probabilities of positive and negative sentiment. Finally, for a targeted domain, the system obtains the topic aspects and their associated sentiment comments, and presents the results as a hypertree infographic.
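The idea of recording the path and frequency of each node and deriving sentiment probabilities from them can be sketched as a simple prefix tree. This is not the full FP-Growth algorithm and not the thesis's implementation: the class names, the attribute paths, and the back-off to the deepest matched prefix are assumptions made for illustration.

```python
from collections import defaultdict


class Node:
    """One tree node: child links plus per-label path frequencies."""

    def __init__(self):
        self.children = {}
        self.counts = defaultdict(int)  # sentiment label -> frequency


class SentimentTree:
    """Prefix tree over attribute paths (hypothetical structure)."""

    def __init__(self):
        self.root = Node()

    def add(self, path, label):
        """Record one labeled comment along its attribute path."""
        node = self.root
        node.counts[label] += 1
        for item in path:
            node = node.children.setdefault(item, Node())
            node.counts[label] += 1

    def probabilities(self, path):
        """Estimate label probabilities from the deepest matched prefix."""
        node = self.root
        for item in path:
            nxt = node.children.get(item)
            if nxt is None:
                break  # unseen item: back off to the prefix matched so far
            node = nxt
        total = sum(node.counts.values())
        if total == 0:
            return {}
        return {label: count / total for label, count in node.counts.items()}
```

A new comment is mapped to an attribute path, and the relative label frequencies stored at the deepest node it reaches serve as the estimated positive and negative probabilities.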

