Using Text Mining to Classify the Documents on Social Media of Online Games

Advisors: 葉清江, 李慶長


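In recent years Taiwan has consistently ranked among the top 15 markets worldwide by game revenue, and what game companies care about most is what players think. This study therefore takes the words players exchange on social media as its basis and applies text mining to them, in the hope of showing game companies what players think and which types of documents they prefer. A review of the literature found no prior work that applies text mining to social media related to online games, which gives this study its research value. A crawler written in R collected the documents posted from January to December 2016 on three PTT boards (30931 from the LOL board, 18464 from the WOW board, and 8389 from the Hearthstone board) and compiled them into tables. Text analysis was then performed in SAS 9.4: 349 of the 1164 key terms on the LOL board were grouped into 4 categories, 636 of the 2107 key terms on the WOW board into 4 categories, and 400 of the 1183 key terms on the Hearthstone board into 5 categories. For each document, a weight was computed for each category, and these category weights were used in regression analyses of the document's push (upvote) and boo (downvote) counts, which show clearly which categories have the larger effect on each count. Finally, the key terms are presented as word clouds, and a term network graph is drawn around one core term chosen from each category, as a reference for decision-making by companies in the industry.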


Taiwan has ranked among the top 15 markets worldwide by game revenue in recent years, and its online game companies are naturally concerned with the opinions of their players. This study applies text mining to player posts on social media to identify what players think and which types of documents they prefer, in the expectation of supporting more accurate marketing programs. Its significance lies in analyzing online game social media through text mining, an approach to which few previous studies have paid attention. The data are documents and comments from the PTT LOL, WOW, and Hearthstone boards: 30931 documents from the LOL board, 18464 from the WOW board, and 8389 from the Hearthstone board, all posted in 2016, were collected with a crawler written in R and analyzed with SAS 9.4 Text Miner. To relate terms to topics, 349 of the 1164 key terms on the LOL board were classified into 4 topics, 636 of the 2107 on the WOW board into 4 topics, and 400 of the 1183 on the Hearthstone board into 5 topics. The study then computes each document's weight on each topic and regresses the document's numbers of Likes and Dislikes on these weights, board by board, to show which topics have the larger effect on each count. Word clouds and term network graphs are also presented at the end of this paper.
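The weighting and regression steps described above can be illustrated with a minimal Python sketch. It is not the thesis's SAS Text Miner procedure: the topic weight here is simply the share of a document's tokens that fall in a topic's term list, and the topic names, term lists, documents, and push counts are all made-up assumptions for illustration.

```python
from collections import Counter

# Hypothetical topic-to-term lists; the thesis's actual terms are not given here.
TOPICS = {
    "esports": {"team", "match", "finals"},
    "gameplay": {"patch", "champion", "nerf"},
}

def topic_weights(tokens, topics=TOPICS):
    """Share of a document's tokens that fall in each topic's term list."""
    counts = Counter(tokens)
    total = sum(counts.values()) or 1
    return [sum(counts[t] for t in terms) / total for terms in topics.values()]

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(features, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    x = [[1.0] + list(row) for row in features]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)

# Toy documents (token lists) and made-up push counts, for illustration only.
docs = [
    ["team", "match", "finals", "today"],
    ["patch", "nerf", "champion", "again"],
    ["team", "patch", "weekend", "plans"],
    ["nothing", "relevant", "here", "really"],
]
pushes = [30, 12, 18, 5]

weights = [topic_weights(d) for d in docs]
coefs = ols(weights, pushes)  # [intercept, esports coefficient, gameplay coefficient]
```

A larger coefficient for a topic would suggest that documents weighted toward that topic tend to draw more pushes; the same fit can be repeated with boo counts as the response.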


Text Mining, Social Media, Online Games, SAS Text Miner, PTT


