Identifying Botnet Malicious Domain Names Based on Lexical Analysis

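The computing power and connectivity of personal computers have grown substantially in recent years. Attackers on the Internet no longer focus primarily on servers; instead they build botnets, compromising personal computers by spreading worms and Trojan horses and then controlling the victims through a C&C channel to carry out malicious activities for profit. The resulting harm to the Internet has become a global problem. In recent years botnets have adopted encrypted traffic and domain fluxing to hide their traffic. To block botnets that use these techniques, DNS domain name blacklists that cut off botnet C&C connections are among the most effective defensive strategies, so distinguishing botnet malicious domain names from other DNS domain names is an important issue in countering the botnet threat. This paper applies lexical analysis to DNS domain names and trains and evaluates decision tree models on different feature combinations, in order to find the combination of lexical features best suited to identifying botnet malicious domain names. We propose five main features: domain name length, syllable count, vowel count, vowel ratio, and character repetition count. The experiments show that, when features are compared pairwise, vowel ratio and vowel count best separate the blacklist and whitelist samples. When false positive and false negative rates are computed, the combination of vowel ratio and vowel count also performs best, with a false positive rate of only 0.01% and a false negative rate of 4.3%, indicating that the proposed method can effectively identify malicious domain names.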

Keywords

Botnet; Machine learning; Lexical analysis

Parallel abstract

The computing power and connectivity of personal computers have increased dramatically, so personal computers, rather than traditional servers, have become the major targets of malicious attacks. By infecting victims' personal computers with worms and Trojan horses, attackers establish botnets and remotely control the victims to perform malicious activities for profit, which has become a major Internet threat. Encryption and domain fluxing are currently the major evasion techniques used by botnets. To defend against botnets, a DNS blacklist is one of the most effective strategies for blocking hidden botnet connections. Therefore, effectively identifying malicious domain names is a critical issue for botnet detection. This paper presents a lexical analysis approach to finding distinguishing patterns in domain names and adopts decision tree models to find the best combination of lexical features for identifying malicious domain names. We propose five major features: domain length, syllable count, vowel count, vowel ratio, and character repetition count. Experimental results show that vowel ratio and vowel count most effectively differentiate blacklist and whitelist samples when features are compared pairwise. In terms of false positive and false negative rates, the combination of vowel ratio and vowel count also gives the best results, a 0.01% false positive rate and a 4.3% false negative rate, which shows that our approach can effectively identify malicious domain names.

Parallel keywords

Botnet; Machine learning; Lexical analysis
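As an illustration of the five lexical features named in the abstract, the following is a minimal Python sketch, not the authors' implementation. The abstract does not define the features precisely, so several choices here are assumptions: the input is a domain label with the TLD already removed, syllables are approximated as runs of consecutive vowels, and the character repetition count is taken as the number of characters minus the number of distinct characters. The helper names `lexical_features` and `error_rates` are hypothetical. The decision tree training itself is not shown.

```python
import re

VOWELS = "aeiou"


def lexical_features(name: str) -> dict:
    """Compute five lexical features for a domain label (TLD removed).

    The exact feature definitions are assumptions; the abstract only
    names the features.
    """
    name = name.lower()
    length = len(name)
    vowel_count = sum(ch in VOWELS for ch in name)
    # Guard against an empty label to avoid division by zero.
    vowel_ratio = vowel_count / length if length else 0.0
    # Heuristic: one syllable per maximal run of consecutive vowels.
    syllable_count = len(re.findall(r"[aeiou]+", name))
    # Assumed definition: characters that repeat an earlier character.
    repeated_chars = length - len(set(name))
    return {
        "length": length,
        "syllable_count": syllable_count,
        "vowel_count": vowel_count,
        "vowel_ratio": vowel_ratio,
        "repeated_chars": repeated_chars,
    }


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate).

    Labels: 1 = malicious (blacklist), 0 = benign (whitelist).
    """
    pairs = list(zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    negatives = sum(t == 0 for t, _ in pairs)
    positives = sum(t == 1 for t, _ in pairs)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


if __name__ == "__main__":
    # Illustrative labels only, not samples from the paper's data sets.
    for label in ("google", "xkqzvbw"):
        print(label, lexical_features(label))
```

Under these assumed definitions, a pronounceable label tends to have a higher vowel ratio than a randomly generated one, which is consistent with the abstract's finding that vowel count and vowel ratio separate the two classes best. The feature dictionaries could be passed to any decision tree learner.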

