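With the rapid growth of e-commerce and the widespread use of the Internet, email has become an increasingly common channel for business communication. The mass sending of spam easily congests network capacity, and email service providers must spend enormous manpower and resources to handle it. Spam not only interferes with normal communication services but has also seriously degraded the network environment for the general public.
Many studies have proposed heuristic classification techniques for automatic spam filtering, which can recognize the variations of spam and filter them out. To evade detection by filtering software, however, spam is now moving toward sparse content, image-based messages, fewer URL links (to avoid tracing), and intermittent low-volume sending, so the features of future spam will increasingly resemble those of legitimate email. Current classification techniques must be modified to re-identify spam features and must be continually updated to filter mail effectively.
This study proposes a spam filtering system built around an artificial immune system approach. An artificial immune system is an adaptive system that applies theoretical immunology and observed immune functions, principles, and models to problem solving. Spam features are treated as antigens and anti-spam rules as antibodies; when an antigen binds with its corresponding antibody, the spam is considered recognized. Imitating clonal selection in the human immune system, antibodies that are stimulated and can recognize antigens are cloned and mutated to produce new antibodies, which can adapt to new spam variants designed to evade detection and filtering.
The results show that, judged jointly on the spam interception rate and the rate at which legitimate mail is wrongly blocked, the prototype system achieves a higher spam interception rate and a lower false-interception rate than existing spam filtering software in tests on both public datasets and daily mail. The study finds that an artificial immune system built on an immunogenetic algorithm core can compensate for the limitation of earlier genetic algorithm systems, which could only solve for simple feature combinations, and that the resulting adaptability yields notably better classification quality on new patterns of mail.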
With the popularity of e-commerce and the vigorous growth of the Internet, email has gradually become the most common means of commercial communication. This utility is now widely abused. Recipients have to spend a great deal of time dealing with large numbers of commercial emails, which not only wastes time but also hinders the receipt of important mail. Because commercial emails are sent in bulk, email service providers must expend substantial effort and resources to deal with the resulting network congestion. Besides hindering normal network service, spam has also seriously damaged the public network environment.
Previous research has proposed many heuristic classification approaches for spam filtering. To avoid detection and filtering by antispam utilities, however, spammers have evolved their methods toward sparse content, image-based messages, fewer URL links, intermittent sending, and so on. To respond to these changes, such classification approaches must be modified and updated continuously.
Our research proposes a prototype spam filtering system that applies the artificial immune system (AIS) approach. An AIS is designed to imitate the processes of the human immune system and, like human immunity, has the abilities of recognition and adaptation. When pathogens invade the human body, macrophages, acting as the vanguard on the front line, break the pathogens down into antigens. We treat the features extracted from spam as the antigens and the antispam rules as the antibodies. Spam is considered recognized when an antibody binds with its corresponding antigen. Imitating the process of clonal selection, the activated antibodies are cloned and mutated to produce new antibodies. These antibodies can filter new types of spam that antispam utilities cannot detect.
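The antigen, antibody, and clonal selection analogy described above can be illustrated with a minimal sketch. It is not the prototype system's implementation. It assumes, purely for illustration, that an antigen is the set of feature tokens of one message, that an antibody is a rule expressed as a set of tokens, and that binding means the overlap ratio reaches a threshold. The names `affinity`, `binds`, `clone_and_mutate`, and `clonal_selection`, the threshold of 0.5, and the mutation scheme are all hypothetical.

```python
import random


def affinity(antibody: frozenset, antigen: frozenset) -> float:
    """Fraction of the antibody's features that also appear in the antigen."""
    if not antibody:
        return 0.0
    return len(antibody & antigen) / len(antibody)


def binds(antibody: frozenset, antigen: frozenset, threshold: float = 0.5) -> bool:
    """Assumed binding rule: the antibody recognizes the antigen at or above the threshold."""
    return affinity(antibody, antigen) >= threshold


def clone_and_mutate(antibody, antigen, n_clones, rng):
    """Copy a stimulated antibody n_clones times, swapping one feature in each copy."""
    candidates = sorted(antigen - antibody)  # antigen features the rule does not cover yet
    clones = []
    for _ in range(n_clones):
        clone = set(antibody)
        if candidates and clone:
            clone.discard(rng.choice(sorted(clone)))  # drop one existing feature
            clone.add(rng.choice(candidates))         # adopt one feature from the antigen
        clones.append(frozenset(clone))
    return clones


def clonal_selection(antibodies, antigen, threshold=0.5, n_clones=3, seed=0):
    """Return the stimulated antibodies and the enlarged antibody pool."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    stimulated = [ab for ab in antibodies if binds(ab, antigen, threshold)]
    mutants = []
    for ab in stimulated:
        mutants.extend(clone_and_mutate(ab, antigen, n_clones, rng))
    # Keep every original rule, plus mutants that still recognize the antigen.
    pool = set(antibodies) | {m for m in mutants if binds(m, antigen, threshold)}
    return stimulated, pool


# Hypothetical rules and one hypothetical spam message.
antibodies = [frozenset({"viagra", "free", "click"}), frozenset({"meeting", "agenda"})]
spam = frozenset({"free", "click", "v1agra", "offer"})
stimulated, pool = clonal_selection(antibodies, spam)
print(len(stimulated), "stimulated;", len(pool), "antibodies in the pool")
```

In this sketch each mutant swaps one of its features for a feature observed in the message, so a rule written for one spelling of a spam term can drift toward a disguised spelling. A real system would also need a way to retire antibodies that match legitimate mail, which the sketch omits.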
Evaluated jointly on the spam precision rate and the legitimate recall rate, our prototype system performs better than SpamAssassin in both the spam corpus test and the daily mail test. We conclude that using immunogenetics as the kernel of the AIS can solve the multi-objective optimization problem that the original genetic algorithm cannot. This adaptability makes the system more effective at classifying new types of spam.
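The abstract does not define the two measures it names. A minimal sketch, assuming the conventional definitions, is given below: spam precision is the share of messages flagged as spam that really are spam, and legitimate recall is the share of legitimate messages that are not flagged. The function names and the `"spam"` and `"ham"` labels are hypothetical.

```python
def confusion_counts(labels, predictions):
    """Count true/false positives and negatives, treating "spam" as the positive class."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(labels, predictions):
        if guess == "spam":
            if truth == "spam":
                tp += 1
            else:
                fp += 1  # legitimate mail wrongly blocked
        else:
            if truth == "spam":
                fn += 1  # spam that slipped through
            else:
                tn += 1
    return tp, fp, tn, fn


def spam_precision(tp: int, fp: int) -> float:
    """Of the messages flagged as spam, the fraction that really are spam."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def legitimate_recall(tn: int, fp: int) -> float:
    """Of the legitimate messages, the fraction that were delivered (not flagged)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical labels and filter decisions for five messages.
labels = ["spam", "spam", "ham", "ham", "ham"]
predictions = ["spam", "ham", "spam", "ham", "ham"]
tp, fp, tn, fn = confusion_counts(labels, predictions)
print(spam_precision(tp, fp), legitimate_recall(tn, fp))
```

Under these definitions a false positive lowers both measures at once, which is one reason to assess the two together rather than separately.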