
使用高效能遠端虛擬記憶體聚合未使用之記憶體空間

Aggregating Unused Memory with Efficient Remote Swapping

Advisor: 洪士灝

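Abstract

In recent years, driven by the demand for big data analytics, the high-speed networks inside datacenters and clusters have rapidly evolved toward higher bandwidth and lower latency. Compared with the past, sharing resources among servers over high-speed networks has become more efficient than ever. In this thesis, we study techniques for sharing memory remotely between servers over a high-speed internal network, using the memory of a remote server as swap space through the virtual memory system. We propose a highly portable swap memory architecture built from industry-standard open-source system software and hardware. Because it requires only configuring and installing software and hardware, with no special modifications to the operating system, we believe the architecture can be applied widely across many domains.

We validate the remote memory mechanism with simple test programs and real applications, including an in-memory cache (memcached), machine learning model training (TensorFlow), and genome sequence alignment (MUMmer), in a 50 Gbps high-speed Ethernet environment. The experimental results show that our mechanism provides efficient remote virtual memory access. First, compared with the traditional swap mechanism backed by hard disks, we do not need to pay the high cost of purchasing memory to avoid virtual memory thrashing. For example, with our approach, TensorFlow training of a deep learning model runs only 1.24 times slower than with enough physical memory to hold all the data, and 16 times faster than with a traditional hard disk. Finally, because it uses RDMA over Converged Ethernet (RoCE), our mechanism imposes very low overhead on both servers.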

Keywords

Swap space; remote swap space; RDMA; RoCE

Parallel Abstract


In recent years, driven by the demand for big data analytics, the performance of datacenter interconnection networks has improved vastly, with higher bandwidth and lower latency. With these high-speed network technologies, sharing resources among servers has become more efficient than ever. In this thesis, we study remote swap memory technologies, which allow one server to use the memory of a remote server as swap space for its virtual memory system via a high-speed interconnection network. We propose a portable remote memory swap mechanism built from reliable open-source system software and industry-standard hardware components. The mechanism is constructed by configuring software and hardware, without vendor-specific modifications to the operating system, so we believe the methodology is generic and useful for a wide range of applications. To evaluate the performance of the proposed mechanism, we run microbenchmarks as well as realistic applications, including an in-memory cache (memcached), machine learning model training (TensorFlow), and genome sequence alignment (MUMmer), on a setup with two servers connected via 50 Gbps Ethernet. The experimental results demonstrate the efficiency of our mechanism. First, the remote memory swap mechanism is faster than the traditional swap mechanism backed by hard disks, and it saves the cost of adding physical memory to avoid virtual memory thrashing. For example, with our remote swap mechanism, TensorFlow training of deep learning models ran 16 times faster than when swapping to a local disk, and only 1.24 times slower than on a server with enough physical memory to hold the entire data set. Finally, owing to the use of RDMA over Converged Ethernet (RoCE), our remote memory swap mechanism imposed little overhead on either server.
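
The two TensorFlow figures quoted above can be combined into a rough relative comparison of the three memory configurations. The sketch below is illustrative arithmetic only: it assumes both ratios refer to the same workload and setup, and it normalizes the in-memory run time to 1.0, which is a choice made here and not a measurement from the thesis. The names are hypothetical.

```python
# Relative run times implied by the two ratios quoted in the abstract.
# Assumption: both ratios describe the same TensorFlow workload, and the
# in-memory run is normalized to 1.0 purely for illustration.

IN_MEMORY_TIME = 1.0            # baseline: physical memory holds the whole data set
REMOTE_SLOWDOWN = 1.24          # remote swap vs. in-memory (from the abstract)
REMOTE_SPEEDUP_OVER_DISK = 16   # remote swap vs. local-disk swap (from the abstract)


def relative_times(in_memory=IN_MEMORY_TIME):
    """Return the run time of each configuration relative to the baseline."""
    remote = in_memory * REMOTE_SLOWDOWN
    # Remote swap is 16x faster than disk swap, so disk swap takes 16x longer.
    disk = remote * REMOTE_SPEEDUP_OVER_DISK
    return {"in_memory": in_memory, "remote_swap": remote, "disk_swap": disk}


times = relative_times()
for name, t in times.items():
    print(f"{name:12s} {t:6.2f}x")
```

Under these assumptions, swapping to a local disk would take roughly 1.24 × 16 ≈ 19.8 times as long as running entirely in physical memory, which is the gap that remote swapping narrows to 1.24 times.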

Parallel Keywords

Swap; Remote Swap; RDMA; RoCE

