Development of an OpenCL 2.0 Simulator and Analysis of Program Characteristics

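The role of the GPU in heterogeneous systems has evolved from a graphics accelerator into a device that can handle many kinds of large-scale computation, the so-called GPGPU architecture. To make better use of the GPU's computing power, future heterogeneous system architectures will integrate the CPU and GPU more tightly. This architectural evolution opens up many possible design directions for system architecture research. However, because academia currently lacks a simulator for such integrated CPU-GPU heterogeneous systems, few research results have been produced in this area so far. This thesis modifies an existing simulator, gem5-gpu, so that it supports the heterogeneous computing standard OpenCL 2.0. We chose OpenCL because it is now supported by hardware from many vendors, and we therefore believe the standard is sufficiently representative of future heterogeneous system architectures and computing standards. In addition, we use the modified simulator to evaluate how the features newly added in the OpenCL 2.0 standard affect program performance. These features include dynamic parallelism, shared virtual memory, and atomic operations. They give the GPU more powerful computing capabilities and enable data sharing between the CPU and GPU, and so better demonstrate the strength of heterogeneous computing.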

Keywords

Heterogeneous computing; GPGPU computing; OpenCL; simulator

Parallel Abstract

The GPU, as a computing node in a heterogeneous system, has evolved from an accelerator into a general-purpose computing device that can handle many kinds of tasks. To better utilize the computing power of GPUs, many future heterogeneous systems will integrate CPUs and GPUs more closely. Such heterogeneous system architectures open up many directions for future architecture research, but the lack of a heterogeneous system simulator has kept researchers from exploring this domain further. In this thesis, we extend the existing integrated CPU-GPU simulator gem5-gpu to support the OpenCL 2.0 standard. We believe that OpenCL, as a standard widely adopted by industry, best represents the future design of heterogeneous systems. In addition, we evaluate on our simulator the impact of the new features introduced in OpenCL 2.0. These features, which include device kernel enqueue, shared virtual memory, and enhanced atomic operations, strengthen the computing capability of GPUs and enable fine-grained data sharing between CPUs and GPUs, demonstrating the power of heterogeneous computing.