In recent years, memory-intensive workloads have become pervasive in large-scale, data-centric applications. Unfortunately, when the working sets of these applications do not fully fit in a server's physical memory, their performance deteriorates severely, degrading quality of service and user experience. Today's high-speed interconnects, with access latencies on the order of microseconds and bandwidths of 50 Gbps, make it feasible for a node to swap memory pages to idle physical memory on other nodes, offering a viable solution to this problem. In this thesis, we measure the latency of remote memory swapping and analyze the overhead incurred in the software layers. Based on this analysis, we develop a detailed, realistic, and highly configurable timing model for evaluating remote swapping architectures. Given an application workload and a system architecture, the timing model estimates the effectiveness of remote swapping and can guide the design of a cluster. Analyzing and measuring the penalty of page faults is difficult for two reasons. First, a page fault is an exception handled by the Linux kernel. Second, neither Linux nor the hardware provides a direct and accurate way to measure the penalty caused by page faults. With our proposed method, users can analyze the performance loss incurred by the Linux kernel's page fault handler and by the swap hardware. In our experiments, we found that the average latency of a major page fault is 100.27 µs using an HDD and 20.975 µs using RDMA. Using measurement records collected while the application runs, the timing model predicts the swapping time for both HDD and RDMA swap devices with an error rate below 15%.
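As the abstract notes, neither Linux nor the hardware exposes a direct counter for per-fault latency. A coarse user-level approximation can nevertheless be obtained by timing a page-touching loop and dividing by the major-fault count reported by getrusage(). The sketch below is only illustrative and is not the measurement tool developed in this thesis: the input file path is a placeholder, and the resulting average lumps loop overhead and minor-fault costs together, which is exactly why a finer-grained analysis of the kernel's fault path is needed.

/* A minimal user-level sketch: touch every page of an mmap'ed file and
 * divide the elapsed time by the number of major faults reported by
 * getrusage(). For the touches to cause major faults, the file must be
 * cold in the page cache (e.g., after `echo 3 > /proc/sys/vm/drop_caches`). */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

static double now_us(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

int main(int argc, char **argv) {
    const char *path = argc > 1 ? argv[1] : "bigfile";   /* placeholder input */
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    long page = sysconf(_SC_PAGESIZE);

    char *buf = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);
    double t0 = now_us();

    volatile char sink = 0;
    for (off_t off = 0; off < st.st_size; off += page)
        sink ^= buf[off];            /* each first touch may major-fault */

    double t1 = now_us();
    getrusage(RUSAGE_SELF, &after);

    long majflt = after.ru_majflt - before.ru_majflt;
    printf("elapsed %.1f us over %ld major faults -> %.2f us/fault (coarse)\n",
           t1 - t0, majflt, majflt ? (t1 - t0) / majflt : 0.0);
    return 0;
}

Because the kernel's readahead can satisfy several neighboring faults with a single I/O, such a user-level average blurs the cost of individual faults; a finer-grained analysis of the kernel's page fault handler, like the one this thesis proposes, separates those effects.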