
以類免疫系統法建置垃圾郵件過濾系統之研究

Apply the Artificial Immune System Approach to Develop a Spam Filtering Prototype System

Advisor: 皮世明

Abstract


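With the rapid growth of e-commerce and the widespread use of the Internet, email has become an increasingly common channel for business communication. The mass sending of spam readily congests network capacity, and email service providers must spend considerable manpower and resources to handle it. Besides hindering normal communication services, spam has seriously degraded the network environment for the general public. Many studies have proposed heuristic classification techniques for automatic spam filtering, which aim to recognize the many variations of spam and filter them out. To evade detection and filtering by anti-spam software, however, spam is now moving toward sparse content, image-based messages, fewer URL links (to avoid tracing), and intermittent low-volume sending, so the features of future spam will increasingly resemble those of legitimate email. Current classification techniques must be modified to re-identify spam features and must be updated continually in order to keep filtering effectively. This study proposes a spam filtering system built around the artificial immune system (AIS) approach. An artificial immune system is an adaptive system that applies theoretical immunology and observed immune functions, principles, and models to problem solving. Spam features are treated as antigens, and the rules that defend against spam are treated as antibodies; when an antigen can bind with its corresponding antibody, the spam is considered recognized. Imitating the clonal selection process of the human immune system, antibodies that are stimulated and can recognize antigens are cloned and mutated to produce new antibodies, which can adapt to new spam variants designed to evade detection and filtering by anti-spam software. The results show that, weighing the spam interception rate and the legitimate-mail false-interception rate together, the prototype system achieves a higher spam interception rate and a lower false-interception rate than existing spam filtering software in tests on both a public dataset and everyday mail. The study also finds that an artificial immune system with an immunogenetic algorithm at its core can compensate for a limitation of earlier genetic-algorithm systems, which could only solve for simple feature combinations, and that the resulting adaptability yields notably better classification quality on new patterns of mail.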
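The antigen/antibody matching and clonal selection steps described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: representing antigens and antibodies as sets of feature tokens, the overlap-based `affinity` measure, the 0.6 binding threshold, and the single-feature `mutate` operator are all assumptions chosen for illustration.

```python
import random


def affinity(antibody: frozenset, antigen: frozenset) -> float:
    """Fraction of the antibody's features that also appear in the antigen."""
    if not antibody:
        return 0.0
    return len(antibody & antigen) / len(antibody)


def binds(antibody: frozenset, antigen: frozenset, threshold: float = 0.6) -> bool:
    """An antibody 'binds' (recognizes spam) when affinity reaches the threshold."""
    return affinity(antibody, antigen) >= threshold


def mutate(antibody: frozenset, antigen: frozenset, rng: random.Random) -> frozenset:
    """Clone the antibody and swap one feature for an antigen feature it lacks."""
    features = sorted(antibody)
    candidates = sorted(antigen - antibody)
    if not features or not candidates:
        return antibody  # nothing to mutate toward
    features[rng.randrange(len(features))] = rng.choice(candidates)
    return frozenset(features)


def clonal_selection(repertoire, antigen, n_clones=3, threshold=0.6, rng=None):
    """Return the repertoire extended with mutated clones of stimulated antibodies."""
    rng = rng or random.Random(0)
    stimulated = [ab for ab in repertoire if binds(ab, antigen, threshold)]
    extended = set(repertoire)
    for ab in stimulated:
        for _ in range(n_clones):
            extended.add(mutate(ab, antigen, rng))
    return extended


def is_spam(repertoire, antigen, threshold=0.6) -> bool:
    """A message is flagged when any antibody in the repertoire binds to it."""
    return any(binds(ab, antigen, threshold) for ab in repertoire)


# Hypothetical feature tokens, for illustration only.
repertoire = {frozenset({"free", "viagra", "click"})}
spam = frozenset({"free", "viagra", "offer", "now"})
variant = frozenset({"free", "offer", "now", "pills"})
adapted = clonal_selection(repertoire, spam, n_clones=30)
```

In this toy setup the original antibody recognizes `spam` but not `variant`; after clonal selection on `spam`, some mutated clones share enough features with `variant` to bind it, which is the kind of adaptation the abstract attributes to the approach.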

Parallel Abstract


With the popularity of e-commerce and the vigorous growth of the Internet, email has gradually become the most common means of commercial communication. This utility is now increasingly abused. Recipients must spend a great deal of time dealing with large numbers of commercial emails, which not only wastes time but also hinders the receipt of important mail. Because commercial emails are sent in bulk, email service providers must devote substantial effort and resources to dealing with network congestion. Besides hindering normal network service, spam has seriously damaged the public network environment. Previous research has proposed many heuristic classification approaches for spam filtering. To evade detection and filtering by anti-spam utilities, however, spammers have evolved their methods toward sparse content, image-based messages, fewer URL links, intermittent sending, and similar tactics. To respond to these changes, those classification approaches must be modified and updated continually. Our research proposes a spam filtering prototype system that applies the artificial immune system (AIS) approach. An artificial immune system is designed to imitate the processes of the human immune system, and like human immunity it has the abilities of recognition and adaptation. When pathogens invade the human body, macrophages, acting as the vanguard on the front line, process the pathogens into antigens. By analogy, we treat the characteristics transcribed from spam as antigens and the anti-spam rules as antibodies. Spam is considered recognized when an antibody binds with its corresponding antigen. Imitating the process of clonal selection, the activated antibodies are cloned and mutated to produce new antibodies, which can filter new forms of spam that anti-spam utilities cannot detect.
We assess the spam precision rate and the legitimate recall rate together. The results show that our prototype system performs better than SpamAssassin in both the spam corpus test and the daily mail test. We conclude that using immunogenetics as the kernel of the AIS can solve the multi-objective optimization problem that the original genetic algorithm cannot. This adaptability makes the classification of new forms of spam more effective.
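The evaluation measures named above can be made concrete with a short Python sketch. The abstract does not give its formulas, so the definitions below are assumptions based on common usage in the spam filtering literature: spam precision is the share of flagged messages that really are spam, and legitimate recall is the share of legitimate messages that are delivered. The function name `filter_metrics` and the sample labels are hypothetical.

```python
def filter_metrics(y_true, y_pred):
    """Compute spam-filter metrics from boolean labels (True means spam)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # spam correctly blocked
    fp = sum(1 for t, p in pairs if not t and p)      # legitimate mail blocked
    fn = sum(1 for t, p in pairs if t and not p)      # spam let through
    tn = sum(1 for t, p in pairs if not t and not p)  # legitimate mail delivered

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "spam_precision": ratio(tp, tp + fp),
        "spam_recall": ratio(tp, tp + fn),        # spam interception rate
        "legit_recall": ratio(tn, tn + fp),
        "false_positive_rate": ratio(fp, fp + tn),  # false-interception rate
    }


# Made-up labels for illustration: 3 spam and 5 legitimate messages.
y_true = [True, True, True, False, False, False, False, False]
y_pred = [True, True, False, True, False, False, False, False]
metrics = filter_metrics(y_true, y_pred)
```

Under these definitions, legitimate recall and the false-positive rate always sum to 1, so a filter that raises spam precision without lowering legitimate recall is blocking more spam without blocking more legitimate mail.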

