A Method for Phishing Website Validation and Detection

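In this thesis, we propose a method called PhishBox that effectively collects phishing website data and produces models for phishing validation and detection. The proposed method integrates the collection, validation, and detection of phishing websites into a single tool that can monitor the phishing websites on the PhishTank blacklist in real time. Because phishing websites have short lifespans, we propose a two-stage detection model to ensure detection performance. First, we design an ensemble model to validate phishing websites and apply active learning to reduce the cost of manual labeling. The results show that the ensemble validation model performs well, reaching 95% accuracy and a 3.9% false-positive rate. Next, the validated phishing websites are used to train the detection model. Compared with the original data, the false-positive rate of phishing detection drops by 43.7% on average. We also took part in the validation voting on PhishTank, and the results show that the two-stage detection model validates phishing websites effectively. Finally, we found that the blacklist contains a large amount of invalid data. Compared with PhishTank's regular update mechanism, our detector removes about five times as many invalid websites, or more, after one week.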

Keywords

phishing validation; phishing detection; machine learning; active learning

Parallel Abstract

In this thesis, we propose an approach, called PhishBox, to effectively collect phishing data and generate models for phishing validation and detection. The proposed approach integrates the collection, validation, and detection of phishing websites into an online tool that monitors the PhishTank blacklist and validates and detects phishing websites in real time. Because phishing websites are short-lived, the proposed approach uses a two-stage detection model to ensure performance. First, we design an ensemble model to validate the phishing data and apply active learning to reduce the cost of manual labeling. The results show that our ensemble validation model achieves 95% accuracy and a 3.9% false-positive rate. Next, the validated phishing data are used to train a detection model. Compared with the original dataset, the false-positive rate of phishing detection drops by 43.7% on average. After participating in the voting procedure on PhishTank, we find that our two-stage model is effective at verifying phishing websites. Finally, we monitor the blacklist and find that it contains a large amount of invalid data. In our experiment, after one week our detector removes about five times as many invalid websites as the regular update mechanism does.

Parallel Keywords

phishing validation; phishing detection; machine learning; active learning
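
The abstract pairs an ensemble validation model with active learning so that only some samples need manual labeling. The sketch below illustrates that general idea with plain uncertainty sampling: a committee scores each sample, the samples whose averaged score lies closest to 0.5 go to a human reviewer, and the rest are labeled automatically. It is not the thesis's implementation. The three scorers, the feature names (`url_length`, `has_ip_host`, `has_password_form`), the 75-character threshold, and the `budget` parameter are all hypothetical, chosen only to keep the example self-contained.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
Scorer = Callable[[Features], float]  # returns a phishing score in [0, 1]

# Hypothetical committee members. The abstract does not list the real
# features or learners, so these rules are illustrative assumptions.
def url_length_scorer(x: Features) -> float:
    return 1.0 if x["url_length"] > 75 else 0.0

def ip_host_scorer(x: Features) -> float:
    return 1.0 if x["has_ip_host"] else 0.0

def password_form_scorer(x: Features) -> float:
    return 1.0 if x["has_password_form"] else 0.0

COMMITTEE: List[Scorer] = [url_length_scorer, ip_host_scorer, password_form_scorer]

def ensemble_score(x: Features, committee: List[Scorer]) -> float:
    """Average the committee's scores into one ensemble score."""
    return sum(scorer(x) for scorer in committee) / len(committee)

def split_for_labeling(
    samples: List[Features], committee: List[Scorer], budget: int
) -> Tuple[List[Tuple[int, bool]], List[int]]:
    """Pick the `budget` most uncertain samples for manual review.

    Returns (auto_labeled, to_review): auto_labeled holds
    (sample index, is_phishing) pairs decided by the ensemble, and
    to_review holds the indices left for a human annotator.
    """
    scored = [(ensemble_score(x, committee), i) for i, x in enumerate(samples)]
    # Uncertainty sampling: a score near 0.5 means the committee disagrees.
    by_uncertainty = sorted(scored, key=lambda pair: abs(pair[0] - 0.5))
    to_review = sorted(i for _, i in by_uncertainty[:budget])
    review_set = set(to_review)
    auto_labeled = [(i, p >= 0.5) for p, i in scored if i not in review_set]
    return auto_labeled, to_review

# Made-up samples for illustration only; they are not data from the thesis.
samples: List[Features] = [
    {"url_length": 120, "has_ip_host": 1, "has_password_form": 1},
    {"url_length": 20, "has_ip_host": 0, "has_password_form": 0},
    {"url_length": 90, "has_ip_host": 0, "has_password_form": 1},
    {"url_length": 100, "has_ip_host": 1, "has_password_form": 1},
]
auto, review = split_for_labeling(samples, COMMITTEE, budget=1)
print("auto-labeled:", auto)
print("needs manual review:", review)
```

In a full active learning loop, the labels returned by the reviewer would be added to the training set and the committee retrained before the next round of selection. This sketch covers only the selection step.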

