
網絡流量分佈的分類和估計及其應用

Network Traffic Distribution Classification and Estimation with Applications

Advisor: 蘇育德

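Abstract

Network traffic classification is a central topic in many network management applications, and a variety of traffic statistics can be used to perform it. This thesis focuses on packet delay variation (PDV); the probability distribution of this delay can be used to describe the traffic load in a part of the network. We consider two common classes of PDV models. The first assumes that the delay follows an Erlang distribution and divides its shape and rate parameters into eight sets. The second models the delay with a mixed-Weibull distribution and distinguishes seven cases according to which mixing coefficients are present. For each class we propose a solution for joint model identification and parameter estimation, and use it to estimate the probability density function (PDF) of the packet delay and the clock offset between the two ends of the network.

For the first class of delay models, when the clock offset is nonzero, we develop an iterative algorithm to estimate the parameters of each model and then use the estimated values to compute each model's likelihood. The model with the largest likelihood, together with its clock offset estimate, is our solution.

For the second class, we apply moment matching to jointly estimate the mixing coefficients, the Weibull parameters, and the clock offset. We modify the cross-entropy (CE) method to solve the resulting nonlinear multivariable optimization problem, whose variables include the mixing coefficients, of minimizing the sum of squared matching errors. An alternative approach first estimates the mixing coefficients via the Dirichlet distribution and then applies the CE method to the remaining parameters.

Because these methods do not produce satisfactory estimates for the second class when some mixing coefficients are zero, we subdivide it into seven models according to the combinations of mixing coefficients. For each model we use the CE method to estimate the parameters and obtain the delay PDF, then compute the Kullback-Leibler (KL) divergence between this PDF and a non-parametric PDF estimated from the delay samples. The PDF with the minimum KL divergence and its parameter values, including the clock offset, are our answer.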
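
The likelihood-based selection for the first model class can be illustrated with a minimal sketch. It is not the iterative algorithm developed in the thesis: as a stand-in, it grid-searches the clock offset, uses the closed-form maximum-likelihood rate for each candidate Erlang shape, and keeps the candidate with the largest log-likelihood. The function names, the candidate shapes, and the grid size are all assumptions made for illustration.

```python
import math
import random


def erlang_loglik(delays, shape, rate):
    """Log-likelihood of positive delays under Erlang(shape, rate)."""
    n = len(delays)
    return (n * shape * math.log(rate)
            - n * math.lgamma(shape)                      # log((shape-1)!)
            + (shape - 1) * sum(math.log(d) for d in delays)
            - rate * sum(delays))


def fit_erlang_with_offset(samples, shape, grid_size=200):
    """Assumed stand-in for the thesis's iterative estimator.

    Each observation is modeled as offset + Erlang delay. The offset is
    searched on a grid strictly below the smallest sample; for a fixed
    offset and shape, the rate MLE is shape / mean(delay).
    """
    smallest = min(samples)
    spread = max(samples) - smallest
    best = None
    for i in range(1, grid_size + 1):
        offset = smallest - spread * i / grid_size
        delays = [s - offset for s in samples]
        rate = shape / (sum(delays) / len(delays))
        loglik = erlang_loglik(delays, shape, rate)
        if best is None or loglik > best[0]:
            best = (loglik, offset, rate)
    return best


def select_erlang_model(samples, candidate_shapes):
    """Return (shape, rate, offset, loglik) of the most likely candidate."""
    fits = {k: fit_erlang_with_offset(samples, k) for k in candidate_shapes}
    best_shape = max(fits, key=lambda k: fits[k][0])
    loglik, offset, rate = fits[best_shape]
    return best_shape, rate, offset, loglik


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic timestamps: clock offset 5.0 plus Erlang(shape=3, rate=2) delay.
    data = [5.0 + rng.gammavariate(3, 1.0 / 2.0) for _ in range(4000)]
    print(select_erlang_model(data, [1, 3, 9]))
```

The candidate shapes in the usage example are deliberately far apart; closely spaced shapes are hard to tell apart once the offset is also free, so a real candidate set would likely need more samples or a finer search.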

Parallel Abstract


Network traffic classification and estimation is a central problem in many network management applications, and various traffic statistics can serve as classification features. This thesis focuses on one such statistic, the packet (routing) delay variation (PDV). We consider two general classes of statistical PDV models. The first class assumes that the packet delay follows an Erlang distribution whose shape and rate parameters fall into eight sets. The second class describes the packet delay by a mixed-Weibull distribution, which yields different traffic classes depending on which mixing coefficients are present. For each class, we present joint model identification and parameter estimation methods that yield a PDV probability density function (PDF). We then apply these methods to network clock synchronization in the presence of unknown PDV.

For each model of the first class, we develop an iterative algorithm to estimate the associated parameters (and the clock offset, if nonzero). Using the estimated parameter values, we compute the likelihood of every model. The model with the largest likelihood is selected, and its parameter estimates, including the clock offset, are the output.

For the second class, we invoke moment matching to jointly estimate the mixing coefficients, the Weibull parameters, and, where applicable, the clock offset. We use the cross-entropy (CE) method to solve the associated nonlinear multivariable optimization problem, whose variables include the mixing coefficients, namely minimizing the sum of squared matching errors. We also present an alternative approach that first estimates the mixing coefficients via the Dirichlet distribution method and then estimates the remaining parameters with the CE method.

When some of the mixing coefficients are zero, the above methods do not give satisfactory estimates. We therefore divide the second class into seven models according to the candidate mixing coefficient combinations. For each model, we apply the CE method to estimate the parameters of concern, then compute the Kullback-Leibler (KL) divergence between the PDF estimated from the delay samples and the PDF specified by the estimated clock offset and model parameters. The parameter values and PDF of the model with the minimum KL divergence are our final estimate.
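
The final KL-based selection step can be sketched as follows. This is a minimal illustration, not the thesis's procedure: it assumes three Weibull components (so that the non-empty combinations of mixing coefficients number seven), uses a plain histogram as the sample-based PDF estimate, and takes the candidate PDFs as already fitted, skipping the CE estimation entirely. All function names, the bin count, and the probability floor are hypothetical.

```python
import math
from itertools import combinations


def support_patterns(num_components=3):
    """All non-empty subsets of component indices (7 when there are 3)."""
    idx = range(num_components)
    return [c for r in range(1, num_components + 1)
            for c in combinations(idx, r)]


def weibull_pdf(x, shape, scale):
    """Weibull density; zero for non-positive arguments."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def mixture_pdf(y, weights, params, offset):
    """Density of an observation y = offset + delay, delay ~ Weibull mixture.

    params is a list of (shape, scale) pairs, one per component.
    """
    return sum(w * weibull_pdf(y - offset, k, s)
               for w, (k, s) in zip(weights, params))


def kl_to_model(samples, model_pdf, bins=40):
    """Approximate KL(histogram || model) over equal-width bins.

    The model mass of a bin is approximated by pdf(bin center) * width and
    floored at 1e-12 so that empty model regions give a large finite penalty.
    """
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        counts[min(int((s - lo) / width), bins - 1)] += 1
    n = len(samples)
    kl = 0.0
    for i, c in enumerate(counts):
        if c == 0:
            continue  # bins with no samples contribute nothing
        p = c / n
        q = max(model_pdf(lo + (i + 0.5) * width) * width, 1e-12)
        kl += p * math.log(p / q)
    return kl


def select_by_kl(samples, candidates):
    """candidates maps a label to a fitted PDF callable.

    Returns the label with the smallest approximate KL and all scores.
    """
    scores = {name: kl_to_model(samples, pdf)
              for name, pdf in candidates.items()}
    return min(scores, key=scores.get), scores
```

In use, each of the seven entries of `support_patterns(3)` would get its own fitted PDF, for example a `lambda y: mixture_pdf(y, weights, params, offset)` with zero weight on the absent components, and `select_by_kl` would return the pattern whose PDF is closest to the histogram.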

