With advances in network technology, social media has come into wide public use. People post their opinions on social networks such as Facebook and Twitter, and these posts reveal a good deal about their authors, including the things they like and the views they lean toward. This information can be used to group users, which supports follow-up research and analysis or commercial applications. In this work, we collect the posts that users publish on social networks and apply a topic model to identify the topics and topic words each user commonly uses. Given each user's topic distribution, we then group similar users with clustering methods such as k-means and affinity propagation. We also take time into account and examine how users' topics and cluster assignments change across time slices. Finally, we apply the method to data from PTT to show how well user clustering works on Chinese-language posts and to report what we found.
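The pipeline described above (posts, then per-user topic distributions, then clusters) can be illustrated with a minimal sketch. It assumes scikit-learn and uses LDA as the topic model, which the abstract does not specify. The toy posts, the function names `group_posts`, `topic_distributions`, and `cluster_users`, and all parameter values are hypothetical and not taken from this work. Chinese text such as PTT posts would additionally need word segmentation before vectorization, which is not shown here.

```python
from collections import defaultdict

import numpy as np
from sklearn.cluster import AffinityPropagation, KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data: (user, time_slice, post text).
posts = [
    ("alice", 0, "football match goal team win"),
    ("alice", 1, "team coach goal league match"),
    ("bob", 0, "league football team score goal"),
    ("bob", 1, "match win team football coach"),
    ("carol", 0, "election vote party policy debate"),
    ("carol", 1, "party policy government vote law"),
    ("dave", 0, "government election debate law vote"),
    ("dave", 1, "policy party election law debate"),
]


def group_posts(posts, by_slice=False):
    """Merge posts into one document per user, or per (user, time_slice)."""
    grouped = defaultdict(list)
    for user, time_slice, text in posts:
        key = (user, time_slice) if by_slice else user
        grouped[key].append(text)
    return {key: " ".join(texts) for key, texts in grouped.items()}


def topic_distributions(docs_by_key, n_topics=2, seed=0):
    """Fit LDA (an assumed choice of topic model) and return one topic vector per key."""
    keys = sorted(docs_by_key)
    counts = CountVectorizer().fit_transform([docs_by_key[k] for k in keys])
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    theta = lda.fit_transform(counts)
    # Normalize explicitly so every row is a probability distribution.
    theta = theta / theta.sum(axis=1, keepdims=True)
    return keys, theta


def cluster_users(theta, n_clusters=2, seed=0):
    """Cluster topic vectors with k-means and with affinity propagation."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    kmeans_labels = kmeans.fit_predict(theta)
    # Affinity propagation picks the number of clusters itself.
    ap_labels = AffinityPropagation(random_state=seed).fit_predict(theta)
    return kmeans_labels, ap_labels


users, theta = topic_distributions(group_posts(posts))
kmeans_labels, ap_labels = cluster_users(theta)
```

To look at the time effect, the same functions can be applied to `group_posts(posts, by_slice=True)`, which yields one topic vector per user and time slice. The rows belonging to each slice can then be clustered separately and the assignments compared across slices.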