Predicting the Emergence of Exploits: An Application of Social Media (Twitter) Intelligence Analysis

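As network infrastructure spreads and information systems come into wide use, enterprises and organizations are increasingly exposed to information security risks. The software and hardware vulnerabilities that are disclosed from time to time give cyber criminals a channel for developing exploits that harm these enterprises and organizations. Vulnerability information and related discussion have long been exchanged on internet forums, and with the rise of social media these platforms have become venues for exchanging security information as well. The aim of this study is to use the vulnerability messages posted and discussed on Twitter to identify, ahead of time, the vulnerabilities that cyber criminals are likely to exploit in attacks. In addition to collecting vulnerability information from Twitter, the study draws on other security resources to enrich the description of each vulnerability's characteristics: the U.S. National Vulnerability Database, third-party vulnerability platforms (CVE Details and VulDB), ExploitDB, and Microsoft TechNet. The study proposes a three-stage classification method to predict the probability that a vulnerability will be exploited, and uses k-means clustering to adjust the ratio of positive to negative instances in the sample, reducing the effect of data (class) imbalance on prediction accuracy. The three stages are: (1) the first stage trains a support vector machine (SVM) classifier; (2) the instances that the SVM test results judge likely to have exploit code implemented are used in the second stage to train a decision tree classifier; (3) for the instances that the decision tree test results classify as having exploit code implemented, the third stage computes the Bayes' probability, which serves as a basis for enterprise defense or for vendors developing patches.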
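The abstract does not say how k-means is used to adjust the ratio of positive to negative instances. One common approach, shown here purely as an assumption, is to cluster the majority (negative) class and keep a single representative per cluster. The following is a minimal sketch in plain Python; the function names, the toy feature vectors, and the choice of one representative per cluster are all hypothetical and are not taken from the study.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k starting points drawn without replacement
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster ended up empty.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


def undersample_negatives(negatives, n_keep, seed=0):
    """Assumed strategy: keep the real sample closest to each cluster centroid."""
    if n_keep >= len(negatives):
        return list(negatives)
    centroids, clusters = kmeans(negatives, n_keep, seed=seed)
    return [
        min(members, key=lambda p: sq_dist(p, centroid))
        for centroid, members in zip(centroids, clusters)
        if members
    ]


# Hypothetical toy data: 2 exploited (positive) vs. 6 unexploited (negative) samples.
positives = [(5.0, 5.0), (5.5, 4.5)]
negatives = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

kept = undersample_negatives(negatives, n_keep=len(positives))
balanced = [(p, 1) for p in positives] + [(n, 0) for n in kept]
print(len(positives), "positives vs.", len(kept), "negatives after undersampling")
```

Keeping actual samples rather than synthetic centroids means every training instance still corresponds to a real vulnerability record. Whether the study keeps centroids, representatives, or some other subset is not stated in the abstract.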

Keywords

Vulnerability; data imbalance; machine learning; classification; support vector machine; decision tree; Bayes' probability

Parallel Abstract

With the growth and maturing of networking infrastructure and the popularity of information systems, enterprises and organizations are increasingly exposed to information security risk. Frequently disclosed software and hardware vulnerabilities give cyber criminals a convenient way to exploit and attack enterprises and organizations. Publications and discussions of vulnerabilities are often found on internet forums, and social media have become major platforms for such information exchange as they have grown in popularity. The goal of this study is to use Twitter messages about vulnerabilities to assess the probability that a vulnerability will be exploited in the real world. Besides messages on Twitter, information security resources are also used to extract the features of a vulnerability; these resources include the National Vulnerability Database, CVE Details, VulDB, ExploitDB, and Microsoft TechNet. The study proposes a three-stage classification model to predict the probability that a vulnerability will be exploited, and employs k-means clustering to adjust the ratio between positive and negative instances in the sample, alleviating the data (class) imbalance problem during training. The steps of the three-stage classifier are: (1) the first stage trains a support vector machine (SVM); (2) at the second stage, the instances in the testing sample that the SVM classifies as exploited are used as the training sample for a decision tree classifier; (3) the third stage computes the Bayes' probabilities of the instances that the decision tree classifies as exploited in its testing result. The resulting Bayes' probabilities serve as a reference for enterprises or vendors in taking appropriate action on a vulnerability.
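The control flow of the three stages at prediction time can be illustrated with a short sketch. It makes several assumptions: `stage1` and `stage2` are hypothetical threshold rules standing in for the trained SVM and decision tree (training is omitted), the feature vectors and labels are invented, and the Bayes' probability is taken to be P(exploited | flagged by both stages) estimated from counts, which the abstract does not confirm.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]
Predictor = Callable[[Features], bool]


def cascade_flags(samples: Sequence[Features],
                  stage1: Predictor,
                  stage2: Predictor) -> List[bool]:
    """Flag a sample only if stage 1 and then stage 2 both predict 'exploited'."""
    return [bool(stage1(x)) and bool(stage2(x)) for x in samples]


def bayes_posterior(flags: Sequence[bool], labels: Sequence[int]) -> float:
    """Estimate P(exploited | flagged) with Bayes' theorem from labelled counts.

    P(E | F) = P(F | E) * P(E) / P(F)
    """
    n = len(labels)
    n_pos = sum(labels)                                      # truly exploited
    n_flag = sum(1 for f in flags if f)                      # flagged by the cascade
    n_both = sum(1 for f, y in zip(flags, labels) if f and y)
    if n == 0 or n_pos == 0 or n_flag == 0:
        return 0.0
    prior = n_pos / n            # P(E)
    likelihood = n_both / n_pos  # P(F | E)
    evidence = n_flag / n        # P(F)
    return likelihood * prior / evidence


# Hypothetical stand-ins for the trained classifiers.
def stage1(x: Features) -> bool:   # placeholder for the SVM decision
    return x[0] > 0.5


def stage2(x: Features) -> bool:   # placeholder for the decision tree decision
    return x[1] > 0.5


# Invented held-out data: two features per vulnerability, label 1 = exploited.
samples = [(0.9, 0.8), (0.7, 0.9), (0.8, 0.6), (0.9, 0.2),
           (0.1, 0.9), (0.2, 0.1), (0.3, 0.3), (0.6, 0.4)]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

flags = cascade_flags(samples, stage1, stage2)
posterior = bayes_posterior(flags, labels)
print(f"P(exploited | flagged by both stages) = {posterior:.3f}")
```

In this toy data, three samples pass both stages and two of them are truly exploited, so the estimate is 2/3. A defender could rank flagged vulnerabilities by such a probability when deciding what to patch first.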

Parallel Keywords

Vulnerability; data imbalance; machine learning; classification; support vector machine; decision tree; Bayes' probability
