
On-The-Fly Capture and Replay Mechanisms for Multi-port Network Devices in Operational Networks

Advisor: Ying-Dar Lin (林盈達)

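Abstract


Testing network devices in a live environment exposes them to complex real traffic, but it can cause network outages, and the defects observed cannot be reproduced. Testing by replaying real traffic can reproduce defects, but the limitations of traffic replay tools and incomplete reconstruction of the state of the device under test (DUT) lead to a poor defect reproduction rate. To preserve complex test traffic and improve the defect reproduction rate, we design a new mechanism that uses an OpenFlow switch to take the DUT online and offline automatically and to perform multi-port traffic replay. While the DUT is online, the mechanism monitors it and captures defect traffic. To save space, we capture only the packet length and packet count sufficient to trigger the defect. When the DUT goes offline, the defect traffic is replayed to identify the defect. We apply different reduction methods to different types of defects so that they can be identified efficiently. Experimental results show that retaining only part of each packet is enough to trigger the defects. For Layer-2 devices, the first 46 bytes of each packet suffice; for our Layer-3 device, the first 154 bytes suffice. The required packet count depends on the test environment. For defect identification, we design reduction methods, based on binary search, for defects caused by packet fields and for defects caused by overload. The proposed reduction achieves a downsizing ratio of up to 98.8% for defects caused by packet fields and up to 96% for defects caused by overload. For the service outage caused by taking the DUT offline, we find that a monitoring interval of 1 second with a tolerance of 2 consecutive failures reduces the outage time most effectively.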
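
The capture side of the mechanism can be illustrated with a minimal Python sketch, under stated assumptions. `truncate_packets` keeps only the leading bytes of each captured packet (46 and 154 are the lengths reported in the abstract), and `first_offline_check` decides when the DUT would be taken offline from a sequence of health-check results. The function names, the representation of a packet as `bytes`, and the reading of the tolerance as "go offline at the Nth consecutive failed check" are illustrative assumptions, not the thesis implementation.

```python
from typing import Iterable, List, Optional

# Truncation lengths reported in the abstract.
L2_SNAP_LEN = 46    # Layer-2 devices
L3_SNAP_LEN = 154   # the Layer-3 device used in the experiments


def truncate_packets(packets: Iterable[bytes], snap_len: int) -> List[bytes]:
    """Keep only the first snap_len bytes of each packet (hypothetical helper)."""
    return [pkt[:snap_len] for pkt in packets]


def first_offline_check(
    check_results: Iterable[bool], max_consecutive_failures: int = 2
) -> Optional[int]:
    """Return the index of the health check at which the DUT goes offline.

    check_results holds one boolean per check interval (True = DUT healthy).
    Assumption: the DUT is taken offline as soon as the number of
    consecutive failed checks reaches max_consecutive_failures.
    Returns None if that never happens.
    """
    streak = 0
    for index, healthy in enumerate(check_results):
        streak = 0 if healthy else streak + 1
        if streak >= max_consecutive_failures:
            return index
    return None


if __name__ == "__main__":
    captured = [b"\xaa" * 100, b"\xbb" * 10]
    print([len(p) for p in truncate_packets(captured, L2_SNAP_LEN)])
    print(first_offline_check([True, False, True, False, False, True]))
```

In this sketch an isolated failed check resets nothing permanently: a single failure followed by a healthy check does not take the DUT offline, which is the kind of trade-off between false alarms and outage time that the check interval and failure tolerance control.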

Parallel Abstract


Testing networking devices in a live environment exposes them to complex real traffic, but it may interrupt the network, and the defects encountered cannot be reproduced. Replaying real traffic to test networking devices can reproduce defects, but the effectiveness of defect reproduction is limited by the constraints of replay tools and by incomplete reconstruction of the state of the device under test (DUT). To retain the complexity of the test traffic while improving the effectiveness of defect reproduction, we design a new mechanism that uses an OpenFlow switch to bring the DUT online and offline automatically and to perform multi-port replay for multi-port networking devices. While the DUT is online, we monitor it and capture defect traces. To save storage space, we capture only a partial payload and a limited packet count, enough to trigger the defects. When we detect a DUT failure, we take the DUT offline and replay the defect trace to identify the defect. For efficient defect identification, we apply different reductions to different types of defects. The experimental results show that a partial payload in the packets of the captured defect traces can trigger the defects: the first 46 bytes are enough for Layer-2 devices, and the first 154 bytes are sufficient for our Layer-3 device. The packet count of a defect trace depends on the testbed. For defect identification, we propose a reduction based on binary search to handle defects caused by payload anomalies and defects caused by busy conditions. The downsizing ratio is up to 98.8% for defects caused by payload anomalies and up to 96% for defects caused by busy conditions. For the outage incurred by failover during a DUT failure, the minimum outage time is obtained when the check interval is 1 second and the tolerated number of consecutive failures is 2.
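
The abstract does not spell out the reduction algorithm, so the following Python sketch shows only one plausible binary-search reduction for a defect caused by a payload anomaly. It repeatedly halves the defect trace and keeps whichever half still triggers the defect. The names `reduce_trace`, `triggers_defect`, and `downsizing_ratio` are hypothetical; `triggers_defect` stands in for replaying a candidate trace to the offline DUT and observing whether it fails, and the ratio is assumed to be the fraction of packets removed.

```python
from typing import Callable, List, Sequence

Trace = Sequence[bytes]


def reduce_trace(trace: Trace, triggers_defect: Callable[[Trace], bool]) -> List[bytes]:
    """Shrink a defect trace by binary search (illustrative sketch).

    At each step the trace is split in half. If one half alone still
    triggers the defect, the search continues in that half. If neither
    half does, the defect depends on packets from both halves and the
    current trace is returned unchanged.
    """
    current = list(trace)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if triggers_defect(first):
            current = first
        elif triggers_defect(second):
            current = second
        else:
            break
    return current


def downsizing_ratio(original_count: int, reduced_count: int) -> float:
    """Fraction of packets removed by the reduction (assumed definition)."""
    return 1.0 - reduced_count / original_count


if __name__ == "__main__":
    # Toy trace: 250 one-byte "packets", one of which is the anomalous one.
    trace = [bytes([i]) for i in range(250)]
    anomalous = bytes([137])
    reduced = reduce_trace(trace, lambda t: anomalous in t)
    print(len(reduced), downsizing_ratio(len(trace), len(reduced)))
```

Each step costs at most two replays, so a trace of n packets needs on the order of 2·log2(n) replays when a single packet is responsible. Defects caused by busy conditions depend on load rather than on one packet, so they would need a different oracle, for example a search over the replay rate, which this sketch does not cover.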

