A study of effective features for detecting long-surviving Twitter spam accounts

As social networking websites have grown popular in recent years, the number of spammers on them has also increased rapidly. Spammers can annoy or harm users by posting malicious links, commercial spam messages, and pornographic messages. Prior studies have proposed a number of features for detecting spam accounts. In this work, we examine the common features to see whether they are effective for detecting Twitter spam accounts. We use the Twitter API to collect 26,758 public accounts with 508,403 tweets, sampled intermittently from September 2011 to March 2012. Although Twitter has its own algorithm to detect and suspend spam accounts, many spam accounts still survive for a long time. We select 816 long-surviving spam accounts, observe their activities continuously, and record their lifespans in days. We find that some of the common features are not very effective for detection, and we select two features, the URL rate and the interaction rate, as the detection features in this work. With the C4.5 algorithm, the precision of the detection is between 0.82897 and 0.88503, and the recall is between 0.98727 and 0.99873. We also present several additional techniques to further improve the detection accuracy.
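The abstract names the two detection features and the evaluation metrics but does not define them, so the following Python sketch is illustrative only. It assumes that the URL rate is the fraction of an account's tweets containing at least one URL and that the interaction rate is the fraction of tweets mentioning another user; both definitions, and all function names, are assumptions rather than the authors' method. Precision and recall follow their standard definitions. The C4.5 classifier itself is not shown.

```python
import re

# Assumed patterns: a tweet "contains a URL" if it has an http(s) link,
# and "interacts" if it mentions another user with @name.
URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"(?<!\w)@\w+")


def url_rate(tweets):
    """Assumed definition: fraction of tweets with at least one URL."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if URL_PATTERN.search(t)) / len(tweets)


def interaction_rate(tweets):
    """Assumed definition: fraction of tweets that mention another user."""
    if not tweets:
        return 0.0
    return sum(1 for t in tweets if MENTION_PATTERN.search(t)) / len(tweets)


def precision_recall(predicted, actual):
    """Standard precision and recall, with True meaning 'spam account'."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # spam flagged as spam
    fp = sum(1 for p, a in pairs if p and not a)    # legitimate flagged as spam
    fn = sum(1 for p, a in pairs if not p and a)    # spam that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical tweets for a single account, made up for illustration.
sample_tweets = [
    "buy now http://x.example/a",
    "hello @bob",
    "plain text",
    "see https://y.example",
]
features = (url_rate(sample_tweets), interaction_rate(sample_tweets))
```

Under these assumed definitions, each account is reduced to a two-value feature vector that a decision-tree learner could take as input. A recall near 1 combined with a lower precision, as reported above, would mean that few spam accounts are missed while some legitimate accounts are flagged.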

Keywords

Twitter; spam accounts

