
一個植基於郵件標頭分析的垃圾郵件過濾器

The Design of an E-mail Header Based Spam Filter

Advisor: 陸承志

Abstract


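This study examines the strengths and weaknesses of existing spam filtering techniques and designs a spam filter that combines the advantages of heuristic filtering and Bayesian classification while analyzing only e-mail headers. The method extracts feature patterns from the header fields, feeds them into a Bayesian classifier for training, and then uses the trained model to test other messages. The results show that the filter's speed differs only slightly from that of the open-source tools SpamAssassin, SpamBayes, and Bogofilter, while its results on Chinese, English, and mixed Chinese-English mail are more stable than those of the other tools. Overall, the spam filtering accuracy stays above 88%, and the rate at which legitimate mail is misclassified stays below 0.1%.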
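The pipeline described above (extract features from header fields, train a Bayesian classifier, then classify unseen mail) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the list of header fields, the `field:token` feature scheme, the add-one smoothing, and the names `header_features` and `HeaderNaiveBayes` are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from email import message_from_string

# Hypothetical field selection; the abstract does not say which fields are used.
FIELDS = ("From", "To", "Subject", "Received", "Content-Type", "X-Mailer")


def header_features(raw_message):
    """Return a set of 'field:token' features taken from the headers only."""
    msg = message_from_string(raw_message)
    features = set()
    for field in FIELDS:
        for value in msg.get_all(field, []):
            for token in re.findall(r"[\w.@-]+", str(value).lower()):
                features.add(f"{field.lower()}:{token}")
    return features


class HeaderNaiveBayes:
    """Presence-based naive Bayes over header features (illustrative only)."""

    def __init__(self):
        self.feature_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = Counter()

    def train(self, raw_message, label):
        # label is "spam" or "ham"
        self.message_counts[label] += 1
        self.feature_counts[label].update(header_features(raw_message))

    def spam_log_odds(self, raw_message):
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        # Smoothed prior log-odds of spam versus ham.
        score = math.log((n_spam + 1) / (n_ham + 1))
        # Only features present in the message contribute to the score.
        for feature in header_features(raw_message):
            p_spam = (self.feature_counts["spam"][feature] + 1) / (n_spam + 2)
            p_ham = (self.feature_counts["ham"][feature] + 1) / (n_ham + 2)
            score += math.log(p_spam / p_ham)
        return score

    def is_spam(self, raw_message):
        return self.spam_log_odds(raw_message) > 0
```

A classifier built this way never reads the message body, which is why a header-only design can keep pace with full-content filters. It is trained with calls such as `clf.train(raw, "spam")` and queried with `clf.is_spam(raw)`.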

Parallel Abstract


This study reviews several popular spam filters and filtering approaches, and then proposes an e-mail header based spam filter that takes advantage of both heuristic filtering and Bayesian filtering. The experimental results show that the proposed spam filter performs more stably than SpamAssassin, Bogofilter, and SpamBayes. It achieves an average precision rate above 88% and a false positive rate below 0.1%. Its filtering efficiency is also comparable to that of the other spam filters.
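The abstract does not define its two reported measures. Assuming the usual confusion-matrix definitions, with spam as the positive class, they could be computed as in the sketch below. The function name `filter_metrics` and the `"spam"`/`"ham"` label strings are hypothetical, and the thesis may define its rates differently.

```python
def filter_metrics(true_labels, predicted_labels):
    """Return (precision, false_positive_rate), treating "spam" as positive."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == "spam":
            if truth == "spam":
                tp += 1  # spam correctly filtered
            else:
                fp += 1  # legitimate mail wrongly filtered
        else:
            if truth == "spam":
                fn += 1  # spam that slipped through
            else:
                tn += 1  # legitimate mail correctly delivered
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, false_positive_rate
```

Under these definitions, a false positive rate below 0.1% means fewer than one in a thousand legitimate messages is flagged as spam.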

