A Study of Malware Source Code Classification Based on Structural Similarity

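As Advanced Persistent Threat (APT) attacks grow more complex, malware classification has become a critical part of digital forensics. Accurate classification gives the most complete picture of a malware sample's system behavior and simplifies forensic analysis. Traditional malware classification relies either on dynamic analysis after execution or on static analysis combined with reverse engineering to obtain information about system behavior, but malware uses anti-virtual-machine monitoring and obfuscation techniques to lower classification accuracy. As honeypot systems mature, the amount of malware source code they collect keeps growing, and analyzing that source code directly yields the most accurate classification. This thesis therefore proposes an automated malware classification mechanism. Working on malware source code captured by honeypots, it measures the similarity of each sample's file structure and the similarity of its source files, then applies a hierarchical clustering algorithm. Newly captured malware is assigned to the correct class, and new types of malware are identified quickly. The proposed approach substantially reduces the costly, repeated analysis that forensic analysts would otherwise perform on malware of the same type, and it helps them understand an attacker's behavior and intent in the shortest possible time. Experiments show that the proposed system classifies malware source code correctly, and the method can also be applied to other domains that need source code classification.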
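The classification idea described above can be illustrated with a minimal sketch. The abstract does not state which similarity measures or linkage rule the thesis uses, so everything here is an assumption made for illustration: file-structure similarity is taken as the Jaccard index over relative file paths, source-file similarity as the best-match ratio from Python's `difflib.SequenceMatcher`, and clustering as average-linkage agglomeration that stops at a hypothetical `threshold`. The names `structure_similarity`, `source_similarity`, `combined_similarity`, `cluster`, and the `weight` parameter are hypothetical.

```python
from difflib import SequenceMatcher


def structure_similarity(a, b):
    """Jaccard index over the relative file paths of two samples (assumed measure)."""
    paths_a, paths_b = set(a), set(b)
    if not paths_a and not paths_b:
        return 1.0
    return len(paths_a & paths_b) / len(paths_a | paths_b)


def source_similarity(a, b):
    """Symmetric best-match text similarity between the source files of two samples."""
    def one_way(src, dst):
        if not src or not dst:
            return 0.0
        # For each file in src, take its closest match among the files in dst.
        best = [max(SequenceMatcher(None, s, d).ratio() for d in dst.values())
                for s in src.values()]
        return sum(best) / len(best)
    return (one_way(a, b) + one_way(b, a)) / 2


def combined_similarity(a, b, weight=0.5):
    """Weighted mix of structure and source similarity; the weight is an assumption."""
    return weight * structure_similarity(a, b) + (1 - weight) * source_similarity(a, b)


def cluster(samples, threshold=0.5, weight=0.5):
    """Average-linkage agglomerative clustering that stops below `threshold`.

    `samples` maps a unique sample name to a dict of {relative_path: source_text}.
    A sample that never reaches the threshold stays in a cluster of its own,
    which plays the role of a newly discovered malware type.
    """
    names = list(samples)
    sim = {}
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            sim[(x, y)] = sim[(y, x)] = combined_similarity(samples[x], samples[y], weight)

    clusters = [[n] for n in names]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                links = [sim[(x, y)] for x in clusters[i] for y in clusters[j]]
                avg = sum(links) / len(links)
                if avg > best:
                    best, best_pair = avg, (i, j)
        if best < threshold:
            break  # no remaining pair of clusters is similar enough to merge
        i, j = best_pair
        clusters[i].extend(clusters.pop(j))
    return clusters


if __name__ == "__main__":
    # Toy samples invented for illustration; they are not data from the thesis.
    samples = {
        "bot_a": {"src/main.c": "int main(){connect();}", "src/irc.c": "void irc_join(){}"},
        "bot_b": {"src/main.c": "int main(){connect(); }", "src/irc.c": "void irc_join(){ }"},
        "worm": {"scan.py": "import socket\nfor ip in hosts: probe(ip)"},
    }
    print(cluster(samples))
```

In this sketch the two near-identical samples share a cluster, while the unrelated one is left alone as a candidate new class. A real system would likely normalize source text (comments, whitespace, identifiers) before comparison and tune `threshold` and `weight` on labeled data.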

Keywords

Malware classification; static analysis; structural similarity

Parallel Abstract

In the face of Advanced Persistent Threats (APTs), malware classification is one of the promising solutions in the field of digital forensics. In previous literature, researchers performed dynamic analysis, or static analysis after reverse engineering. On the other hand, malware developers use anti-VM and obfuscation techniques to evade malware classifiers. Honeypots are increasingly deployed across different networks, and the malware source code they collect remains unclassified. Source code analysis provides a better classification for forensics. In this paper, a novel classification approach is proposed, based on logic similarity and directory structure similarity. A hierarchical clustering algorithm finds the best-fit class for each test sample and creates a new class if none fits well. New types of malware can thus be identified and then analyzed further. Such classification avoids re-analyzing known malware and frees resources for new malware. The experimental results demonstrate that the proposed system classifies malware effectively with a small misclassification ratio.

Parallel Keywords

Malware classification; static analysis; structure similarity

