A Malware Analysis Architecture That Represents Semantics with Tree Automata

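Malware is a program with malicious intent that may perform actions harmful to the user or the operating system. Common kinds of malware include viruses, worms, Trojan horses, and spyware, and they are among the most serious security threats on the Internet. Using a malware detector is currently the most familiar way to detect malware. A detector can be implemented with different analysis methods; the most basic and popular is syntactic signature matching, which is also widely used in commercial settings. This method, however, cannot effectively detect more advanced malware, because advanced malware evades detection by altering the syntactic structure of the program. Yet even if a malware author changes the syntactic structure to evade detection, this does not change the semantics of the malware itself. For this reason, current research on malware detection focuses mainly on semantics-based methods. In this thesis we propose a semantics-centric malware analysis architecture that covers monitoring program execution, extracting semantic behaviors, and generating malware detectors. Most traditional malware analysis methods use strings as signatures. Trees can express more semantics than strings, so the evolution of signatures from strings to trees is only natural, and our architecture uses trees as signatures. We first use a sandbox to monitor program execution and produce reports of execution traces, then use the reports to build behavior dependency graphs and convert them into trees. Finally, we use a learning algorithm to generate a 3-valued tree automaton, which serves as the malware detector. Our experimental results show that a prototype tool implemented on the proposed architecture is effective and has a low false positive rate.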

Keywords

Malware Analysis; Malware Detector; Sandbox Monitoring; 3-Valued Tree Automata

Parallel Abstract

Malware (or malicious software) refers to programs that have malicious intent and may perform harmful actions. Common malware includes viruses, worms, Trojan horses, and spyware. They represent one of the most notorious security threats on the Internet. Using a malware detector is the most familiar defense against malware. Each malware detector has its own analysis method, and syntactic signature matching is the most basic and prevalent method used in commercial malware detectors. Unfortunately, this syntactic detection mechanism cannot cope effectively with advanced malware, which often uses program obfuscation to alter program structures and can therefore evade detection easily. On the other hand, although malware writers can use obfuscation to evade syntactic malware detectors, the semantics of a malware instance is usually preserved after obfuscation. Semantics-based approaches have therefore become the main focus of research on malware analysis. In this thesis, we propose a semantic-centric malware analysis architecture that includes monitoring of malware executions, extraction of semantic behaviors, and generation of malware detectors. Observing recently proposed methods for malware analysis, we notice that string signatures are still widely used. Moving from strings to trees is a natural evolution, since trees can exhibit more semantics than strings. Therefore, we adopt trees as signatures. First, we use a sandbox to monitor malware execution and output reports of execution traces. We then use the execution traces to construct dependency graphs and convert them into trees. Finally, we use a learning algorithm to obtain a 3-valued deterministic finite tree automaton as a malware detector. Experimental results show that our analysis based on the proposed architecture is effective and has a low false positive rate.
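To make the last step of the pipeline more concrete, the following is a minimal sketch of how a 3-valued deterministic bottom-up tree automaton could classify a behavior tree. It is an illustration under stated assumptions, not the thesis's implementation: the node labels (`create_file`, `write_file`, `set_autorun`, `read_file`), the state names, and the transition table are all hypothetical, and the table is written by hand here, whereas the thesis obtains the automaton with a learning algorithm.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ACCEPT = "malicious"     # run ends in an accepting state
    REJECT = "benign"        # run ends in a rejecting state
    DONT_CARE = "unknown"    # any other state, or no applicable rule


@dataclass(frozen=True)
class Tree:
    label: str
    children: tuple = ()


class ThreeValuedTreeAutomaton:
    """Deterministic bottom-up tree automaton with three kinds of states."""

    def __init__(self, transitions, accepting, rejecting):
        # transitions maps (label, tuple of child states) to one state,
        # so at most one rule applies at each node (determinism).
        self.transitions = transitions
        self.accepting = accepting
        self.rejecting = rejecting

    def run(self, tree):
        # Evaluate the children first, then look up the rule for this node.
        child_states = tuple(self.run(child) for child in tree.children)
        return self.transitions.get((tree.label, child_states))  # None if no rule

    def classify(self, tree):
        state = self.run(tree)
        if state in self.accepting:
            return Verdict.ACCEPT
        if state in self.rejecting:
            return Verdict.REJECT
        return Verdict.DONT_CARE


# Hypothetical, hand-written transition table (assumption for illustration).
automaton = ThreeValuedTreeAutomaton(
    transitions={
        ("create_file", ()): "q_file",
        ("write_file", ("q_file",)): "q_written",
        ("set_autorun", ("q_written",)): "q_mal",
        ("read_file", ("q_file",)): "q_ok",
    },
    accepting={"q_mal"},
    rejecting={"q_ok"},
)

dropper = Tree("set_autorun", (Tree("write_file", (Tree("create_file"),)),))
reader = Tree("read_file", (Tree("create_file"),))
partial = Tree("write_file", (Tree("create_file"),))

for name, tree in [("dropper", dropper), ("reader", reader), ("partial", partial)]:
    print(name, automaton.classify(tree).value)
```

The third verdict is what distinguishes this from an ordinary tree automaton: a tree whose run ends in neither an accepting nor a rejecting state is left unclassified rather than forced into one of the two classes.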

Parallel Keywords

Malware Analysis; Malware Detector; Sandbox Monitoring; 3-Valued Tree Automata

