A method of spam detection based on structural similarity

Li, Chungta


Thesis

Author: 黎俊達 (Li, Chungta)

Advisor: 林柏青

National Chung Cheng University / College of Engineering / Department of Computer Science and Information Engineering / Master's thesis (2012)


Abstract

Spammers usually deliver a large number of spam instances generated from a set of templates. To identify spam messages in the same campaigns, or to detect new spam instances that are likely to belong to known campaigns, we propose a method to group spam messages based on their HTML structural features. We observe that spam mails tend to have similar mail-body structures, even though the words in the bodies can be significantly different to evade spam detection. Rather than inferring the templates and representing them as regular expressions, we extract the HTML tags from the mail bodies as the structural features and build a fingerprint for each structure. With the fingerprints, we can efficiently identify the clusters of similar structures using the simhash algorithm and the Jaccard similarity. The identification is useful for finding new spam instances belonging to known structures, with a recall of up to around 95%, while the false-positive rate for normal mails can be less than 5%.
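The pipeline described in the abstract (tag extraction, fingerprinting, similarity comparison) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the 3-tag shingles, the MD5-based 64-bit simhash, and all function names below are assumptions chosen for illustration, and the sample mails are invented.

```python
import hashlib
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening-tag names in document order, ignoring all text."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of HTML tag names found in a mail body."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def shingles(tags, n=3):
    """Set of n-grams over the tag sequence (n=3 is an assumed choice)."""
    if len(tags) < n:
        return {tuple(tags)}
    return {tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)}


def simhash(features, bits=64):
    """Simhash fingerprint: each feature votes +1/-1 on every bit position."""
    counts = [0] * bits
    for feature in features:
        digest = hashlib.md5(" ".join(feature).encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def jaccard(a, b):
    """Jaccard similarity of two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Two invented spam bodies with the same structure but different words,
# and one structurally different normal mail.
spam_a = ("<html><body><table><tr><td><a href='x'><img src='y'></a>"
          "</td></tr></table><p>Buy now</p></body></html>")
spam_b = ("<html><body><table><tr><td><a href='z'><img src='w'></a>"
          "</td></tr></table><p>Limited offer today</p></body></html>")
normal = ("<html><body><p>Hi</p><p>See you</p>"
          "<ul><li>a</li><li>b</li></ul></body></html>")

sa, sb, sn = (shingles(tag_sequence(m)) for m in (spam_a, spam_b, normal))
print(hamming(simhash(sa), simhash(sb)), jaccard(sa, sb))
print(hamming(simhash(sa), simhash(sn)), jaccard(sa, sn))
```

Because `spam_a` and `spam_b` have identical tag sequences, their fingerprints coincide regardless of the wording. In a clustering setting, the Hamming distance between fingerprints could serve as a cheap first filter, with the Jaccard similarity confirming candidate matches; the thresholds for both would need to be tuned on real data.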

Keywords

Spam; Clustering; Document similarity

