Virtual Platforms for Heterogeneous Multi-core Systems

Advisor: Shih-Hao Hung (洪士灝)

Abstract


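Heterogeneous systems can improve computational and energy efficiency, but developing software on a heterogeneous multi-core platform is often quite complex: programs are hard to debug, and their behavior is hard to monitor and analyze. Full-system simulation offers an alternative to developing and analyzing software on real hardware. Given sufficient simulation speed and accuracy, full-system simulation can effectively support software development and analysis. However, it is very difficult for a simulator to provide both fast simulation and high accuracy, because these two goals conflict with each other. We therefore believe that a good simulation environment must not only use various techniques to improve both, but also find a balance between them. In addition, to aid software development and make effective use of the extra capabilities of a system simulation environment, an integrated development environment must be supported on top of the simulator.

In this thesis, we propose an architecture for simulating heterogeneous multi-core systems that fully exploits the multi-core resources of the host machine. We built a simulation environment for reconfigurable multi-core systems that integrates a multi-core system simulator (PQEMU), a cache simulator (the Ruby cache simulator), and a field-programmable gate array (FPGA) simulator. The multi-core system simulator controls the synchronization of all the simulators and collects performance-related information; the cache simulator models the processors' caches; and the FPGA simulator models specific hardware devices in the system. To speed up simulation, we adopt event-based synchronization.

Experimental results show that the simulation environment we developed is fast and accurate enough to help developers understand the characteristics and behavior of heterogeneous multi-core systems. We also used this environment to develop real-world applications and present two case studies. The first uses the simulation environment to improve the performance of a multi-threaded application; the second uses the full-system simulation environment to develop applications for FPGAs.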

Parallel Abstract


Although heterogeneous systems can improve computation and power efficiency, developing software on a heterogeneous multi-core platform is complicated, because it is relatively hard to debug programs and monitor their performance on the actual system. Full-system simulation is a viable alternative, but simulation speed and timing accuracy are two important issues that must be balanced to satisfy developers' requirements. Furthermore, to support software development on a heterogeneous multi-core platform with proper modeling of the simulated hardware, a unified simulation environment should also be provided.

In this thesis, we propose a new heterogeneous multi-core-on-multi-core simulation framework that leverages the computing resources of the multi-core processors in modern computers. Our framework integrates the PQEMU processor emulator, the Ruby cache simulator, and the iVerilog simulator, each running as an independent process. The PQEMU processor emulator is the central part of our framework: it takes charge of coordinating the simulators and collecting performance information. The Ruby simulator models the memory subsystem, and the iVerilog simulator models the FPGA devices in the system. To increase emulation speed, we adopt an event-based synchronization mechanism to maintain consistency among the simulators. We also demonstrate how to build such a simulation environment by integrating existing tools.

The experimental results demonstrate that the proposed framework can predict the performance of applications on reconfigurable multi-core processor systems with sufficient accuracy and emulation speed. We provide two case studies of developing real-world applications with the proposed framework. The first case study demonstrates performance debugging for multi-threaded applications; the second demonstrates developing FPGA-accelerated applications on full software stacks.
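
As a rough illustration of the event-based synchronization idea mentioned in the abstract, the sketch below lets simulated components interact only at timestamped events instead of stepping every component each cycle. It is a single-process toy, not the thesis's implementation: the `Coordinator` class, the `post` and `run` methods, the component labels, and the latencies (4 and 100 cycles) are all hypothetical and are not taken from PQEMU, Ruby, or iVerilog.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(order=True)
class Event:
    time: int                                   # simulated timestamp in cycles
    seq: int                                    # tie-breaker: preserves posting order
    target: str = field(compare=False)          # hypothetical component label
    action: Callable[[int], None] = field(compare=False)


class Coordinator:
    """Hypothetical central coordinator that orders events by simulated time."""

    def __init__(self) -> None:
        self._queue: List[Event] = []
        self._seq = 0
        self.now = 0
        self.log: List[Tuple[int, str]] = []

    def post(self, time: int, target: str, action: Callable[[int], None]) -> None:
        """Schedule an event; events in the simulated past are rejected."""
        if time < self.now:
            raise ValueError("cannot schedule an event in the simulated past")
        heapq.heappush(self._queue, Event(time, self._seq, target, action))
        self._seq += 1

    def run(self) -> int:
        """Process events in timestamp order, jumping directly between them."""
        while self._queue:
            ev = heapq.heappop(self._queue)
            self.now = ev.time                  # skip idle cycles entirely
            self.log.append((ev.time, ev.target))
            ev.action(ev.time)
        return self.now


def demo() -> List[Tuple[int, str]]:
    coord = Coordinator()

    def cache_access(t: int) -> None:
        # Assumed cache latency of 4 cycles before the CPU sees the reply.
        coord.post(t + 4, "cpu", lambda _t: None)

    def fpga_request(t: int) -> None:
        # Assumed device latency of 100 cycles before the CPU sees the reply.
        coord.post(t + 100, "cpu", lambda _t: None)

    coord.post(50, "fpga", fpga_request)
    coord.post(10, "cache", cache_access)
    coord.run()
    return coord.log
```

In this toy, calling `demo()` handles the cache event before the FPGA event even though it was posted later, because the queue is ordered by simulated time rather than by posting order. A real multi-process framework would additionally need inter-process communication and a policy for how far each simulator may run ahead, which this sketch does not model.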
