A Scheduler Designed for Asymmetric Multicore Processors and Non-Uniform Memory Access Architectures

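Asymmetric Multicore Processors (AMP) combine CPUs that share the same instruction set but differ in characteristics such as clock frequency, cache size, power consumption, and die area into a single multicore system. The concept is also known as a Single-ISA Heterogeneous Multicore System. It emerged mainly because, as CPUs have grown faster and more capable, heavy power consumption and heat dissipation have become the main challenges in building high-performance systems. Research has also shown that, compared with a homogeneous system, an AMP can deliver high performance within the same area and at low power consumption. With the spread of mobile devices, the tradeoff between performance and power is an important issue, and AMP performs well in this respect, which makes it a topic worth studying. Current operating system schedulers assume that every core in a multicore system is identical, so they are not suited to AMP architectures. In addition, as CPU performance rises quickly and core counts grow, demand for memory access increases; when too many running processes compete for a single memory resource, they stall while waiting for data. The non-uniform memory access (NUMA) architecture adds hardware and software support to make data access efficient enough for multicore systems, but current schedulers do not perform as well as expected on NUMA architectures. For these two reasons, I propose a new scheduling strategy that suits both AMP and NUMA architectures. I implemented a user-level scheduler on Linux that adds two policies to the original Linux scheduling mechanism: (1) an Asymmetric-aware Schedule Policy and (2) a NUMA-aware Schedule Policy. Compared with the original Linux scheduler, the new scheduler achieves an average overall speedup of about 1.36x on the PARSEC benchmarks, a significant improvement over earlier work.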

Keywords

asymmetric multicore processors; single-ISA heterogeneous multicore system; multicore; non-uniform memory access; scheduling

Parallel Abstract

An Asymmetric Multicore Processor (AMP) system is composed of CPUs that share the same Instruction Set Architecture (ISA) but differ in characteristics such as clock speed, cache capacity, power consumption, occupied area, and in-order or out-of-order execution. It is also called a single-ISA heterogeneous multicore system. As processor frequency and performance have improved rapidly, power consumption and heat dissipation have become the main challenges in developing high-performance systems. Compared with a homogeneous system, an AMP can cut power consumption substantially while sacrificing little performance. With the popularity of hand-held devices, the tradeoff between performance and power consumption is an important issue, and AMP handles it well. Moreover, the non-uniform memory access (NUMA) architecture is a computer memory design used in multiprocessing that helps keep running processes from starving for data. As core counts keep growing, NUMA architectures are becoming the trend. Existing schedulers assume that the underlying architecture is homogeneous, so they do not work well on AMP and NUMA systems. We therefore propose a new scheduler, the NUMA-aware Scheduler for Asymmetric Multicore Processor, to accommodate the next generation of architectures. We implemented it as a user-level scheduler on Linux, built on top of the existing Linux scheduler. It uses two strategies: (1) an Asymmetric-aware Schedule Policy and (2) a NUMA-aware Schedule Policy. On the PARSEC benchmarks it achieves an average speedup of 1.36x in total execution time, which compares favorably with prior studies.

Parallel Keywords

asymmetric architecture; single-ISA heterogeneous architecture; multicore; non-uniform memory access; scheduling
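The abstract names two policies, asymmetric-aware and NUMA-aware, but does not describe how they are combined. The following is a minimal sketch of one way a user-level placement decision could weigh both concerns, not the thesis's actual algorithm. The `Core` and `Task` types, the `cpu_bound` estimate, the `remote_penalty` weight, and the greedy ordering are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    core_id: int
    speed: float    # hypothetical relative speed: 1.0 = slow core, 2.0 = fast core
    numa_node: int  # NUMA node this core belongs to


@dataclass(frozen=True)
class Task:
    name: str
    cpu_bound: float  # hypothetical 0..1 estimate of benefit from a fast core
    memory_node: int  # NUMA node assumed to hold most of the task's pages


def place_tasks(tasks, cores, remote_penalty=0.3):
    """Greedily map each task to one free core.

    Tasks that benefit most from fast cores choose first (asymmetric-aware);
    a core on a different NUMA node than the task's memory is penalized
    (NUMA-aware). The scoring formula is an assumption, not from the thesis.
    """
    free = list(cores)
    placement = {}
    for task in sorted(tasks, key=lambda t: t.cpu_bound, reverse=True):
        if not free:
            break  # more tasks than cores: leave the rest unplaced

        def score(core):
            gain = task.cpu_bound * core.speed
            penalty = remote_penalty if core.numa_node != task.memory_node else 0.0
            return gain - penalty

        best = max(free, key=score)
        placement[task.name] = best.core_id
        free.remove(best)
    return placement


if __name__ == "__main__":
    cores = [Core(0, 2.0, 0), Core(1, 1.0, 0), Core(2, 2.0, 1), Core(3, 1.0, 1)]
    tasks = [Task("compute", 0.9, 1), Task("io", 0.1, 0), Task("mixed", 0.5, 0)]
    print(place_tasks(tasks, cores))
```

On Linux, a user-level scheduler could then apply such a mapping to real processes with `os.sched_setaffinity(pid, {core_id})`. How the per-task estimates would be measured is left open here.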

