An N-gram-Based Method for Predicting the Emotions of Online News Readers

Prediction of News Readers’ Emotion by N-gram

Advisor: 張昭憲

Abstract


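With the rise of social networks, people have grown used to posting opinions and comments online. These activities leave behind large volumes of public data which, if analyzed carefully, yield valuable information about public preferences and needs. Because of this practical value, industry, government, and academia have all taken up public opinion mining. This study aims to predict the emotions of online news readers, so that readers' likely reactions to newly published news can be anticipated and serve as an important reference for authorities when releasing news and making policy decisions. To this end, the study collected a large volume of online news over a long period, segmented the news text with N-gram techniques, counted the occurrences of frequently used terms, and combined these counts with readers' emotion votes to build a model for predicting reader emotion from news. When predicting the emotion for a test news item, the study also tried several different similarity measures to improve accuracy. Experiments were run on 193,489 news items collected from December 8, 2013 to November 12, 2014. The results show that the proposed method achieves good accuracy in specific news categories. In addition, we found that prediction accuracy improves markedly as the news collection period lengthens. Furthermore, when a major news event occurs, postponing the point at which the model is built yields better predictions.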

Parallel Abstract


With the rise of social networks, people have become used to expressing their opinions and comments online. The activity of network users leaves behind a large amount of publicly available data. By analyzing this data carefully, we can extract valuable information about people's needs and preferences. Because emotion analysis is highly practical, industry, academia, and government have all joined the research on public opinion mining. This study focuses on predicting the emotions of news readers, so that governments or companies can refer to readers' emotions when making decisions. We collected a large volume of internet news over a long period and segmented each news item into words using N-grams. We then counted the frequency of key words and built an emotion model from news readers' emotion votes. When predicting readers' emotions for a news item, this study tries three methods to improve accuracy. The data set comprises 193,489 internet news items collected from December 8, 2013 to November 12, 2014. The proposed method achieves high accuracy in some specific news categories. Accuracy also improves markedly as the news collection period lengthens. When a major news event occurs, postponing the time at which the model is built yields better accuracy.
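The pipeline outlined in the abstract (N-gram segmentation, frequency counting, weighting by reader emotion votes, and similarity-based prediction) might be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: the function names, the `(text, votes)` data layout, the use of character bigrams, and the choice of cosine similarity are all assumptions, since the abstract does not specify the N-gram length or which similarity measures were compared.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_emotion_profiles(articles, n=2):
    """Aggregate n-gram counts per emotion, weighted by vote share.

    `articles` is a list of (text, votes) pairs, where `votes` maps an
    emotion label to a number of reader votes (hypothetical layout).
    """
    profiles = {}
    for text, votes in articles:
        total = sum(votes.values())
        if total == 0:
            continue  # no reader votes, nothing to learn from this article
        grams = Counter(char_ngrams(text, n))
        for emotion, count in votes.items():
            profile = profiles.setdefault(emotion, Counter())
            share = count / total
            for gram, freq in grams.items():
                profile[gram] += freq * share
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(value * b[gram] for gram, value in a.items() if gram in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_emotion(text, profiles, n=2):
    """Return the emotion whose profile is most similar to `text`.

    Cosine similarity is an assumed stand-in for the similarity
    measures tried in the study.
    """
    grams = Counter(char_ngrams(text, n))
    return max(profiles, key=lambda emotion: cosine(grams, profiles[emotion]))


# Toy training data; the texts and vote counts are made up for illustration.
articles = [
    ("market rally cheers investors", {"happy": 9, "angry": 1}),
    ("storm damage angers residents", {"happy": 0, "angry": 10}),
]
profiles = build_emotion_profiles(articles)
print(predict_emotion("market rally", profiles))
```

Character N-grams are used here because they need no dictionary, which is one reason N-gram segmentation is attractive for Chinese text. The abstract's observation that a longer collection period improves accuracy corresponds, in this sketch, to passing more articles to `build_emotion_profiles`.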

