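Multi-core systems based on the Non-Uniform Memory Access (NUMA) architecture solve the scalability problem of the Symmetric Multi-Processing (SMP) architecture. In a NUMA system, however, a task's memory pages may be allocated on different NUMA nodes, and a remote, cross-node memory access takes longer than an access to memory on the node where the task runs. Remote memory accesses and contention for the interconnect links therefore degrade the performance of NUMA systems. The Linux kernel already implements Automatic Page Migration, which uses page table invalidation and page fault handling to move the pages a task accesses remotely to its current node, but this increases memory access overhead. To reduce the remote memory accesses caused by cross-node task migration during load balancing on NUMA systems, previous work proposed the Memory-Aware Load Balancing (MLB) mechanism. MLB modifies the memory allocation functions of the Linux kernel to obtain each task's memory usage on each node, and adds task selection policies to the load balancing mechanism. These policies use the per-node memory usage to choose a suitable task for cross-node migration, reducing the remote memory accesses that occur after the migration. In this thesis, we improve the task selection policies of the MLB mechanism and add several new ones. The new policies take additional factors into account, such as maximum benefit, best cost effectiveness, fewest shared memory pages, and the NUMA factor, in order to select tasks whose cross-node migration both reduces remote memory accesses and raises the chance that a task accesses local-node memory. The new policies are implemented inside the MLB load balancing mechanism at the Linux kernel level. Experimental results show that, compared with the default Linux First-Fit selection policy, the proposed policies effectively reduce remote memory accesses and resource contention, improving NUMA system performance by 4.91% to 7.31%.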
Multi-core systems with the Non-Uniform Memory Access (NUMA) architecture solve the scalability problem of the Symmetric Multi-Processing (SMP) architecture. However, in a NUMA system, a task and its memory pages may be placed on different NUMA nodes, and a processor core takes longer to access memory on a remote node than on its local node. Remote memory accesses and contention for the inter-node interconnect therefore degrade system performance. The Linux kernel implements an automatic page migration mechanism, but it migrates only referenced pages to the node on which the task is currently running, and it relies on page table invalidation and page fault handling, which adds overhead to memory accesses. To reduce the remote memory accesses caused by load balancing in a NUMA system, previous work proposed the Memory-Aware Load Balancing (MLB) mechanism. In MLB, the Linux kernel is modified to keep track of each task's memory usage on each node, and the kernel load balancer is extended with task selection policies. These policies choose a suitable task for inter-node migration according to the task's memory usage on each node, with the aim of reducing remote memory accesses after the migration. In this thesis, we improve the task selection policies of the MLB mechanism and propose several new ones. The goals remain to reduce remote memory accesses and to increase the chance that a task accesses memory on its local node. Unlike the earlier policies, the proposed policies take several additional factors into account, such as maximum benefit, best cost effectiveness, fewest shared pages, and the NUMA factor. The new task selection policies are implemented within the MLB load balancing mechanism in the Linux kernel. The experimental results show that the proposed policies effectively reduce remote memory accesses and resource contention. The performance of the NUMA system increases by 4.91% to 7.31% compared with the default Linux kernel, which always migrates the first movable task for inter-node load balancing.
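
The difference between migrating the first movable task and selecting a task by its per-node memory usage can be illustrated with a minimal user-space sketch. It is not the kernel implementation: the `task_mem` record, the `MAX_NODES` constant, and both function names are hypothetical. The benefit formula (pages on the destination node minus pages on the source node) is only an assumed reading of a "maximum benefit" criterion, since the abstract does not define the policies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4 /* assumed node count for this sketch */

/* Hypothetical per-task record; not a Linux kernel structure. */
struct task_mem {
    int id;
    unsigned long pages[MAX_NODES]; /* resident pages on each NUMA node */
};

/* First-fit: take the first candidate, ignoring where its memory lives. */
static const struct task_mem *pick_first_fit(const struct task_mem *tasks,
                                             size_t n)
{
    return n > 0 ? &tasks[0] : NULL;
}

/*
 * Assumed benefit of moving a task from node src to node dst:
 * pages that become local minus pages that become remote.
 */
static long migration_benefit(const struct task_mem *t, int src, int dst)
{
    return (long)t->pages[dst] - (long)t->pages[src];
}

/* Pick the candidate with the largest assumed benefit (ties: earliest). */
static const struct task_mem *pick_max_benefit(const struct task_mem *tasks,
                                               size_t n, int src, int dst)
{
    const struct task_mem *best = NULL;
    long best_benefit = 0;

    assert(src != dst);
    assert(src >= 0 && src < MAX_NODES && dst >= 0 && dst < MAX_NODES);

    for (size_t i = 0; i < n; i++) {
        long b = migration_benefit(&tasks[i], src, dst);
        if (best == NULL || b > best_benefit) {
            best = &tasks[i];
            best_benefit = b;
        }
    }
    return best;
}
```

For three candidates holding (900, 100), (200, 700), and (500, 500) pages on nodes 0 and 1, a migration from node 0 to node 1 under first-fit moves the first task, leaving 900 of its pages remote, whereas the benefit-based choice moves the second task, which already has most of its pages on the destination node.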