This thesis aims to identify network application traffic at the very start of an application's communication. It builds on the 2012 thesis "On the Cloud-Based Network Traffic Classification and Applications Identification Services" (Chinese title: 「基於網路應用程式通訊回合特徵之網路應用程式流量早期分類法」) by Gin-Yuan-Jai (翟敬源), a master's student in our laboratory. That work proposed a machine learning-based, high-accuracy algorithm called the "APPlication Round method (APPR)", which defines, for each TCP/UDP flow, statistical discriminators that are available at the early stage of the flow and support high-accuracy, real-time traffic classification. Its accuracy on online (real-time) traffic, however, reached only 60%. The main contribution of this thesis is to improve both the accuracy and the performance of that approach on online traffic. A state machine is added to APPR to raise classification accuracy, and a Pre-filter, which filters out well-known applications in advance, is added to improve system performance. From an application-layer perspective, this thesis also adds statistical discriminators that capture the characteristics of the negotiation rounds at the beginning of each flow's application-layer communication. The method uses a pruned C4.5 decision tree, and this thesis further proposes increasing the number of dimensions used by the algorithm to improve its classification accuracy. On online campus network traffic, the proposed method achieves a maximum accuracy of 91.2% and an average overall accuracy of 87.55%, which is 27% to 30% higher than the earlier work on raw online traffic. The proposed method completes traffic testing in a short time, is applicable to encrypted traffic, offers high accuracy, and supports real-time traffic classification.
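The flow-level pipeline summarized above (a Pre-filter first, then round-based discriminators passed to a trained classifier) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the thesis's implementation: the abstract does not define a round precisely, so here a round is assumed to be a burst of payload packets from one endpoint followed by the reply burst from the other. The `Packet` type, the `WELL_KNOWN_PORTS` table, the chosen per-burst features, and the `model` callable (standing in for a trained pruned C4.5 tree) are all hypothetical, and the state machine is omitted because the abstract does not describe it.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Packet:
    direction: int    # +1 = client to server, -1 = server to client
    payload_len: int  # application-layer bytes (0 for pure ACK/handshake packets)


# Hypothetical Pre-filter table: flows on these server ports are labeled directly.
WELL_KNOWN_PORTS = {53: "DNS", 123: "NTP"}


def pre_filter(server_port: int) -> Optional[str]:
    """Return a label for well-known applications, or None to fall through."""
    return WELL_KNOWN_PORTS.get(server_port)


def bursts(packets: List[Packet]) -> List[List[int]]:
    """Group payload packets into maximal same-direction runs.

    Each run is [direction, packet_count, byte_count].
    """
    runs: List[List[int]] = []
    for p in packets:
        if p.payload_len == 0:
            continue  # ignore packets that carry no application data
        if runs and runs[-1][0] == p.direction:
            runs[-1][1] += 1
            runs[-1][2] += p.payload_len
        else:
            runs.append([p.direction, 1, p.payload_len])
    return runs


def round_features(packets: List[Packet], max_rounds: int = 2) -> List[int]:
    """Build a fixed-length vector from the first `max_rounds` rounds.

    Assumption: one round = two consecutive bursts (request, then reply).
    Missing bursts are zero-padded so every flow yields the same dimension.
    """
    runs = bursts(packets)
    feats: List[int] = []
    for j in range(2 * max_rounds):
        feats += runs[j] if j < len(runs) else [0, 0, 0]
    return feats


def classify(server_port: int, packets: List[Packet],
             model: Optional[Callable[[List[int]], str]] = None) -> str:
    """Pre-filter first; otherwise hand the round features to the classifier."""
    label = pre_filter(server_port)
    if label is not None:
        return label
    if model is None:
        return "unclassified"
    return model(round_features(packets))


flow = [Packet(1, 0), Packet(-1, 0), Packet(1, 100), Packet(1, 50),
        Packet(-1, 300), Packet(1, 20)]
print(round_features(flow))
print(classify(53, flow))
```

Because the features depend only on packet directions, counts, and sizes in the first few rounds, a sketch like this never inspects payload contents, which is consistent with the abstract's claim that the approach is applicable to encrypted traffic.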