  • Thesis / Dissertation

一個以樹自動機呈現語意的惡意程式分析架構

A Semantics-Centric Architecture for Malware Analysis Based on Tree Automata

Advisor: 蔡益坤

Abstract


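Malware is software with malicious intent that may perform actions harmful to users or to the operating system. Common malware includes viruses, worms, Trojan horses, and spyware, which are also among the most serious security threats on the Internet. Using a malware detector is currently the most familiar way to detect malware. Detectors can be implemented with different analysis methods; the most basic and popular is syntactic signature matching, which is widely used in commercial settings. This method, however, cannot effectively detect more advanced malware, which evades detection by altering the syntactic structure of the program. Yet even when malware authors alter the syntactic structure to evade detection, they cannot change the semantics of the malware itself. Current research on malware detection therefore focuses on semantics-based methods.

In this thesis we propose a semantics-centric malware analysis architecture that comprises monitoring program execution, extracting semantic behaviors, and generating malware detectors. Most traditional malware analysis methods use strings as signatures. Trees can express more semantics than strings, so the evolution of signatures from strings to trees is only natural, and our architecture uses trees as signatures. We first use a sandbox to monitor program execution and produce reports of the execution records, then use the reports to build behavior dependency graphs and convert them into trees. Finally, we use a learning algorithm to produce a three-valued tree automaton, which serves as the malware detector. Our experimental results show that the prototype tool implemented on the proposed architecture is effective and has a low false-positive rate.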

Parallel Abstract


Malware (or malicious software) refers to programs that have malicious intent and may perform harmful actions. Common malware includes viruses, worms, Trojan horses, and spyware. They represent one of the most notorious security threats on the Internet. Using a malware detector is the most familiar defense against malware. Each malware detector has its own analysis method, and syntactic signature matching is the most basic and prevalent method used in commercial malware detectors. Unfortunately, this syntactic detection mechanism cannot cope effectively with advanced malware, which often uses program obfuscation to alter program structures and can therefore evade detection easily. On the other hand, although malware writers can use obfuscation to evade syntactic malware detectors, the semantics of a malware instance is usually preserved after obfuscation. Semantics-based approaches have therefore become the main focus of research on malware analysis.

In this thesis, we propose a semantics-centric malware analysis architecture that includes monitoring of malware executions, extraction of semantic behaviors, and generation of malware detectors. Observing recently proposed methods for malware analysis, we notice that string signatures are still widely used. Moving from strings to trees is a natural evolution, because trees can exhibit more semantics than strings. We therefore adopt trees as signatures. First, we use a sandbox to monitor the execution of malware and output reports of execution traces. We then use the execution traces to construct dependency graphs and convert them into trees. Finally, we use a learning algorithm to obtain a 3-valued deterministic finite tree automaton as a malware detector. Experimental results show that our analysis based on the proposed architecture is effective and yields few false positives.
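The last two stages described above (dependency graph to tree, then classification by a 3-valued deterministic finite tree automaton) can be illustrated with a small sketch. It is not the thesis implementation, and everything in it is an assumption made for illustration: the trace format, the API-call labels used as tree labels, the state names, and the choice to send unmatched subtrees to a "don't care" sink state. The automaton here is written by hand, whereas the thesis obtains it with a learning algorithm.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical trace record: (call id, API name, ids of earlier calls it depends on).
Trace = List[Tuple[int, str, List[int]]]


@dataclass(frozen=True)
class Tree:
    label: str
    children: Tuple["Tree", ...] = ()


def trace_to_tree(trace: Trace, root_id: int) -> Tree:
    """Unfold the dependency graph rooted at root_id into a tree.

    A call may only depend on earlier calls, so the graph is acyclic;
    producers shared by several consumers are simply duplicated.
    """
    graph = {cid: (name, deps) for cid, name, deps in trace}

    def unfold(cid: int) -> Tree:
        name, deps = graph[cid]
        return Tree(name, tuple(unfold(d) for d in deps))

    return unfold(root_id)


ACCEPT, REJECT, DONT_CARE = "accept", "reject", "dont_care"


class ThreeValuedDFTA:
    """Bottom-up deterministic finite tree automaton with three verdicts."""

    def __init__(self,
                 delta: Dict[Tuple[str, Tuple[str, ...]], str],
                 verdict: Dict[str, str],
                 sink: str = "q_sink") -> None:
        self.delta = delta      # (label, child states) -> state
        self.verdict = verdict  # state -> ACCEPT / REJECT / DONT_CARE
        self.sink = sink        # assumed target of every missing transition

    def state_of(self, tree: Tree) -> str:
        # Evaluate children first, then apply the unique matching transition.
        child_states = tuple(self.state_of(c) for c in tree.children)
        return self.delta.get((tree.label, child_states), self.sink)

    def classify(self, tree: Tree) -> str:
        return self.verdict.get(self.state_of(tree), DONT_CARE)


# Hypothetical behavior: file contents are read and written into a registry value.
trace: Trace = [
    (1, "NtCreateFile", []),
    (2, "NtReadFile", [1]),
    (3, "NtOpenKey", []),
    (4, "NtSetValueKey", [3, 2]),
]
tree = trace_to_tree(trace, root_id=4)

detector = ThreeValuedDFTA(
    delta={
        ("NtCreateFile", ()): "q_file",
        ("NtOpenKey", ()): "q_key",
        ("NtReadFile", ("q_file",)): "q_data",
        ("NtSetValueKey", ("q_key", "q_data")): "q_bad",
    },
    verdict={
        "q_bad": ACCEPT,      # tree matches the malicious signature
        "q_file": REJECT,
        "q_key": REJECT,
        "q_data": REJECT,
        "q_sink": DONT_CARE,  # behavior the automaton says nothing about
    },
)

print(detector.classify(tree))
```

In this sketch the third value lets the automaton stay undecided on trees outside both the malicious and the benign samples, which is one plausible reading of why a 3-valued automaton suits a learned detector.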

