  • Degree thesis

SociRank: Ranking News Importance Based on Social Media Influence

SociRank: Ranking prevalent topics using social media factors

Advisor: Yi-Shin Chen

Abstract


Historically, information about daily events has been provided by mass media sources, specifically news media. Today, social media services such as Twitter provide an enormous amount of user-generated data, which has great potential to contain informative news-related content. For this content to be useful, however, we must find a way to filter out noise and capture only the information that, based on its content similarity to news media, may be considered useful or valuable. Even after noise is removed, the remaining data still poses a problem of information overload. A person cannot process huge amounts of information all at once, so the most valuable information must be prioritized for consumption. To achieve prioritization, the information must be ranked in order of estimated importance. The temporal prevalence of a particular topic in news media is one significant factor of importance and may be considered the media focus of the topic. The topic’s temporal prevalence in social media, specifically Twitter, indicates user interest and may be considered its user attention. Furthermore, the interaction among the social media users who mention the topic indicates the strength of the community discussing it and may be considered its user interaction. We propose an unsupervised method called SociRank, which identifies news topics that are prevalent in both social media and news media and then ranks them using media focus, user attention, and user interaction as measures of importance.
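As a rough illustration of how the three measures named in the abstract could feed a single ranking, the sketch below scores each topic by its share of news articles (media focus), its share of tweets (user attention), and the density of links among the users who mentioned it (user interaction). The abstract does not give formulas, so every definition here is an assumption made for illustration: the `Topic` fields, the share-based normalization, the use of graph density, and the equal weights are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    """Hypothetical per-topic counts for one time window (not from the thesis)."""
    name: str
    news_articles: int  # news articles mentioning the topic
    tweets: int         # tweets mentioning the topic
    users: int          # distinct users who mentioned the topic
    user_links: int     # undirected links among those users


def share(value, total):
    """Fraction of the total, or 0.0 when the total is zero."""
    return value / total if total else 0.0


def interaction(topic):
    """Assumed user-interaction measure: density of the users' link graph."""
    possible = topic.users * (topic.users - 1) / 2
    return topic.user_links / possible if possible else 0.0


def rank_topics(topics, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return (name, score) pairs sorted from highest to lowest score.

    The score is a weighted sum of the three assumed measures; the equal
    default weights are an illustrative choice only.
    """
    total_news = sum(t.news_articles for t in topics)
    total_tweets = sum(t.tweets for t in topics)
    scored = []
    for t in topics:
        media_focus = share(t.news_articles, total_news)
        user_attention = share(t.tweets, total_tweets)
        user_interaction = interaction(t)
        score = (weights[0] * media_focus
                 + weights[1] * user_attention
                 + weights[2] * user_interaction)
        scored.append((t.name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_topics` on a list of `Topic` records returns the topics ordered by combined score. Changing `weights` shifts the emphasis among the three measures, which is one simple way to explore how much each factor drives the ordering.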


