A Cloud-Based Application Classification Service Platform with Machine Learning Algorithms

Advisor: Nen-Fu Huang (黃能富)

Abstract


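This thesis aims to identify network application traffic at the start of an application's communication. It builds on the 2012 thesis by Gin-Yuan-Jai (翟敬源), a master's student in our laboratory, titled 「基於網路應用程式通訊回合特徵之網路應用程式流量早期分類法」 (an early classification method for network application traffic based on application communication round features). That work proposed a machine-learning-based, high-accuracy algorithm, the application round method, which defines for each TCP/UDP flow a set of statistical attributes that give high accuracy and suit real-time traffic identification. However, its identification rate on online traffic reached only 60%. The main contribution of this thesis is to improve both the identification rate and the performance of that work on online traffic: a state machine is added to the application round method to raise the identification rate, and a pre-filter is added to improve system performance. Taking an application-layer perspective, this thesis examines the negotiation process at the start of each flow's application-layer communication and adds statistical attributes that capture negotiation round features, which are then used to classify traffic. The method uses the pruned C4.5 decision tree algorithm, and this thesis also proposes a dimension increment to improve the algorithm's accuracy in traffic classification. On online campus traffic, accuracy reaches up to 91.2%, with an average of 87.55%. Compared with the previous work, the proposed method achieves 27% to 30% higher accuracy on raw online traffic data. The proposed method completes traffic testing in a short time, applies to encrypted traffic, is highly accurate, and can be used for real-time traffic classification.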

Parallel Abstract


This thesis builds on the 2012 master's thesis by Gin-Yuan-Jai, “On the Cloud-Based Network Traffic Classification and Applications Identification Services”, which proposed a machine-learning-based, high-accuracy algorithm called the “APPlication Round method (APPR)” to identify network application traffic at an early stage. For each TCP/UDP flow, discriminators available at the early stage are determined to support high-accuracy traffic classification. However, the accuracy of that method on real-time traffic is only 60%. This thesis proposes methods to improve both the accuracy and the performance of real-time traffic classification. A state machine is added to APPR to improve accuracy, and a pre-filter, which filters out well-known applications in advance, is added to improve performance. Additional discriminators characterize the possible negotiation behaviors of each flow from an application-layer perspective. A pruned C4.5 decision tree is applied to real traffic traces, and this thesis proposes increasing the dimension used by the algorithm to raise classification accuracy. On real-time campus network traffic, accuracy reaches up to 91.2%, with an average overall accuracy of 87.55%. Compared with the 2012 thesis, the proposed methods improve overall accuracy on real-time campus network traffic by 27% to 30%. Furthermore, the proposed method is suitable for identifying encrypted protocols, offers high accuracy, and supports real-time classification.
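
The abstract does not give implementation details, so the following is only a minimal sketch of how a pre-filter and per-round discriminators might fit together. Everything in it is an assumption made for illustration: the port table `WELL_KNOWN_PORTS`, the `Packet` type, the function names, and in particular the definition of a "round" as a maximal run of payload-carrying packets sent in one direction. It does not reproduce the APPR attributes, the state machine, or the C4.5 model of the thesis.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical port table: the abstract does not say which well-known
# applications the pre-filter handles or how it recognizes them.
WELL_KNOWN_PORTS = {53: "DNS", 123: "NTP"}


@dataclass
class Packet:
    from_client: bool  # True if sent by the flow initiator
    payload_len: int   # application-layer payload bytes (0 for a pure ACK)


def pre_filter(dst_port: int) -> Optional[str]:
    """Return a label for well-known traffic, or None to pass the flow
    on to the (not shown) machine-learning classifier."""
    return WELL_KNOWN_PORTS.get(dst_port)


def round_features(packets: List[Packet],
                   max_rounds: int = 3) -> List[Tuple[bool, int, int]]:
    """Group payload-carrying packets into rounds (assumed here to be
    maximal same-direction runs) and return, for at most the first
    max_rounds rounds, a tuple (from_client, packet_count, total_bytes)."""
    rounds: List[Tuple[bool, int, int]] = []
    for p in packets:
        if p.payload_len == 0:
            continue  # ignore packets with no application payload
        if rounds and rounds[-1][0] == p.from_client:
            direction, count, size = rounds[-1]
            rounds[-1] = (direction, count + 1, size + p.payload_len)
        elif len(rounds) == max_rounds:
            break  # enough early rounds observed; stop inspecting the flow
        else:
            rounds.append((p.from_client, 1, p.payload_len))
    return rounds


flow = [Packet(True, 0), Packet(True, 100), Packet(True, 50),
        Packet(False, 300), Packet(True, 20), Packet(False, 10)]
label = pre_filter(8080)          # None: not in the hypothetical table
features = round_features(flow)   # features of the first three rounds
```

In a design along these lines, flows that the pre-filter labels never reach the classifier, and the remaining flows are classified from only their first few rounds, which is what would allow a decision early in the flow and without inspecting payload content.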
