  • Thesis

基於群眾意見關鍵詞向量之社群輿論分析研究

A Public Opinion Keyword Vector for Social Sentiment Analysis Research

Advisor: 張詠淳

Abstract


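In an era of widespread Internet access, the Internet has become an important source for sharing and acquiring knowledge. With the rapid growth of social media, any Internet user can easily express opinions and views about an event. Although this convenience makes the Internet a rich knowledge base for understanding events, the overload of event information also increases the burden on users who try to understand them. In view of this, this study applies text mining techniques to analyze public opinion in social media. Unlike previous research, it analyzes the emotions that the topics of short texts evoke in readers and then aggregates them into the public's opinion and perception. We propose Public Opinion Keyword Embeddings (POKE) to represent each short text from social media, and compare it with the Multinomial Naive Bayes classifier, Decision tree, Logistic regression, and LibShortText, a method commonly used for classifying short social media texts. The experimental results show that the proposed POKE method achieves the best performance across the overall evaluation, which indicates that it can effectively represent the meaning of public opinion in short texts. Combined with a visual analysis method, it also offers a deeper understanding of public opinion in social media.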

Parallel Abstract


In the Internet era, online platforms are the most convenient means for people to share and retrieve knowledge. Social media enables users to easily post their opinions and perspectives on particular issues. Although this convenience makes the Internet a treasury of information, the resulting overload also prevents users from understanding events in their entirety. This research uses text mining techniques to explore public opinion in social media by analyzing the reader’s emotion towards pieces of short text. We propose Public Opinion Keyword Embedding (POKE) for the representation of short texts from social media, and a vector space classifier for the categorization of opinions. We compare our method with the Multinomial Naive Bayes classifier, Decision tree, Logistic regression, and LibShortText, a method commonly used for social media short text classification. The experimental results demonstrate that our method obtains the best overall performance, which indicates that it can effectively represent the semantics of public opinion in short texts. In addition, we combine it with a visual analysis method for keywords that provides a deeper understanding of the opinions expressed on social media topics.

