  • Thesis

Automatic identification of hot topics and user clusters from online discussion forums

Abstract


With advances in Internet technology and changes in how people communicate, much first-hand news is now discussed in Internet forums well before it is reported in the traditional mass media. These forums also provide an effective channel for illegal activities such as the distribution of copyrighted movies, threatening messages, and online gambling. Law enforcement agencies are looking for ways to monitor these discussion forums for possible criminal activity and to download suspicious postings as evidence for investigation. The volume of postings is huge: across 10 popular forums in Hong Kong, we found 300,000 new messages every day. In this thesis, we propose an automatic system that tackles this problem. The proposed system continuously downloads postings from selected discussion forums and employs data mining techniques to identify hot topics and to cluster authors into groups using word-based user profiles. Using these data, we try to identify useful trends and detect crime. We then discuss the results, including the advantages and limitations of the different approaches, and conclude with ways to address these problems and directions for future research.
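The abstract does not specify the algorithms used, so the following is only a minimal sketch of what "word-based user profiles" and author clustering could look like. It assumes a bag-of-words profile per author, cosine similarity, and a simple greedy grouping; the function names, the similarity threshold, and the sample posts are all hypothetical and are not taken from the thesis.

```python
import math
import re
from collections import Counter, defaultdict


def build_profiles(posts):
    """Merge each author's posts into one word-count profile (assumed representation)."""
    profiles = defaultdict(Counter)
    for author, text in posts:
        profiles[author].update(re.findall(r"[a-z]+", text.lower()))
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two word-count profiles."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_authors(profiles, threshold=0.5):
    """Greedy grouping: an author joins the first cluster whose seed profile
    is similar enough, otherwise starts a new cluster. The threshold is an
    illustrative value, not one reported in the thesis."""
    clusters = []  # list of (seed_profile, member_names)
    for author in sorted(profiles):
        for seed, members in clusters:
            if cosine(profiles[author], seed) >= threshold:
                members.append(author)
                break
        else:
            clusters.append((profiles[author], [author]))
    return [members for _, members in clusters]


def hot_terms(posts, min_posts=2):
    """Words that appear in at least `min_posts` distinct posts (a crude 'hot topic' signal)."""
    doc_freq = Counter()
    for _, text in posts:
        doc_freq.update(set(re.findall(r"[a-z]+", text.lower())))
    return sorted(word for word, n in doc_freq.items() if n >= min_posts)


# Hypothetical sample data, invented for illustration only.
sample_posts = [
    ("alice", "new movie download link free movie"),
    ("bob", "free movie download here"),
    ("carol", "football betting odds tonight"),
    ("dave", "betting odds for football match"),
]

groups = cluster_authors(build_profiles(sample_posts))
print(groups)
print(hot_terms(sample_posts))
```

On the sample data, authors who share vocabulary end up in the same group. A real system would likely need stop-word removal, Chinese word segmentation for Hong Kong forums, and a more robust clustering method.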

Keywords

Data mining; Cluster analysis