
具有針對向量最佳化之向量資料流分析之開放計算語言編譯器

An OpenCL Compiler Framework with Vector Data Flow Analysis for SIMD Optimizations on CPUs+GPUs

Advisor: 李政崑 (Jenq-Kuen Lee)

Abstract


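Heterogeneous multi-core platforms are widely used in embedded systems and high-performance computing. Because such platforms combine several processors of different architectures in a single system, they call for advanced software development frameworks. Open Computing Language (OpenCL) is currently a widely used framework for heterogeneous multi-core platforms. Within OpenCL, the back-end compiler plays a very important role: it compiles user programs into executables for processors of different architectures. Most common OpenCL development tools currently use LLVM as their back-end compiler, which raises an interesting question: can OpenCL be implemented efficiently on other compilers? Implementing OpenCL on other compilers may bring further innovation both to academic research and to the development of OpenCL itself. In addition, OpenCL's native support for vector operations requires data flow analysis methods that differ from conventional ones.
This thesis proposes a method for supporting OpenCL on the Open64 compiler. Open64 provides many well-known optimization techniques, and supporting OpenCL on Open64 makes those optimizations available to OpenCL programs. The method covers the front end, middle end, and back end of the Open64 compiler. With OpenCL support in place, we also implement on Open64 a data flow analysis for OpenCL vector operations and, building on that analysis, propose compiler optimizations suited to OpenCL vector operations.
Finally, we carry out a series of experiments. The results show that the OpenCL compiler developed on Open64 can successfully compile and run several benchmark programs from the AMD APP SDK, and that the proposed vector-operation optimizations yield performance improvements of 22% on x86 CPUs and 4% on AMD GPUs. These results show that using Open64 as an OpenCL compiler is feasible and that Open64 can also be used to develop OpenCL-related compiler optimizations.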

Parallel Abstract


The use of heterogeneous multi-core platforms for both embedded and high-performance computing is becoming widespread. Because these platforms integrate processors of different types, they require novel frameworks to support software development. Open Computing Language (OpenCL) is a commonly used framework for programming heterogeneous multi-core platforms. One of the most important parts of OpenCL is the back-end compiler, which compiles OpenCL programs for different processors. Most OpenCL compilers currently use LLVM as their compiler infrastructure, which raises an interesting question: can OpenCL be implemented effectively on other compiler infrastructures? Supporting OpenCL on other compiler infrastructures could provide the opportunity to incorporate more academic innovations into the development of OpenCL and its applications. The single-instruction multiple-data (SIMD) vector constructs of OpenCL also require a dedicated compiler data flow analysis to meet the optimization requirements. Here we describe a method for implementing an OpenCL compiler on the Open64 compiler infrastructure, targeting AMD graphics processing units (GPUs) and x86 CPUs. Open64 is equipped with many legacy compiler optimizations, and supporting OpenCL on Open64 makes it possible to apply these optimizations to OpenCL programs. The required procedures are detailed herein for the front end, middle end, and back end of the Open64 compiler. We then propose a calculus framework to support the data flow analysis of vector constructs in OpenCL programs, which compilers can use to perform SIMD optimizations. We model OpenCL vector operations as data access functions in the style of mathematical functions. We then show that data flow analysis for OpenCL vector constructs can be performed on the basis of these data access functions. Using the information gathered from the data flow analysis, we illustrate a set of SIMD optimizations on OpenCL programs.
Preliminary experimental results show that the Open64-based OpenCL compiler can successfully compile fifteen benchmarks from the AMD APP SDK, and that executing the compiled programs on the AMD GPU platform produces correct results. With our calculus and the proposed compiler optimizations, the SIMD optimizations provide average performance improvements of 22% on x86 CPUs and 4% on AMD GPUs. Of the fifteen selected benchmarks, eleven improve on x86 CPUs and six improve on AMD GPUs. These results demonstrate the potential to adopt Open64 as an alternative OpenCL compiler and to develop OpenCL SIMD optimizations on Open64.
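
To give a feel for what modelling vector operations as data access functions can mean, here is a minimal Python sketch. It is not the calculus proposed in the thesis. It assumes a toy straight-line program in which every statement is an OpenCL-style swizzle assignment, and it computes which lanes of each vector are still needed. The statement encoding and the names `swizzle` and `lane_liveness` are hypothetical, introduced only for illustration.

```python
# Illustrative sketch only: per-lane liveness for OpenCL-style swizzles.
# The statement encoding and all names are hypothetical, not from the thesis.

def swizzle(src, lanes):
    """Encode 'dst = src.<lanes>': dst lane k reads src lane lanes[k]."""
    return ("swizzle", src, tuple(lanes))


def lane_liveness(stmts, live_out):
    """Backward per-lane liveness over straight-line code.

    stmts    : list of (dst, swizzle(...)) assignments, in program order.
    live_out : dict mapping a variable to the set of its lanes that are
               live after the last statement.
    Returns (needed, live_in), where needed[i] is the set of dst lanes of
    statement i that are actually used later, and live_in maps variables
    to the lanes live before the first statement.
    """
    live = {var: set(lanes) for var, lanes in live_out.items()}
    needed = [set() for _ in stmts]
    for i in range(len(stmts) - 1, -1, -1):
        dst, (_, src, lanes) = stmts[i]
        # The assignment redefines every lane of dst, so dst is killed here.
        dst_live = live.pop(dst, set())
        needed[i] = dst_live
        # Access function: a live dst lane k makes src lane lanes[k] live.
        if dst_live:
            live.setdefault(src, set()).update(lanes[k] for k in dst_live)
    return needed, live


# Toy program on 4-lane vectors (lanes x, y, z, w are indices 0..3):
#   b = a.wzyx
#   c = b.xxyy
stmts = [
    ("b", swizzle("a", [3, 2, 1, 0])),
    ("c", swizzle("b", [0, 0, 1, 1])),
]
needed, live_in = lane_liveness(stmts, {"c": {0, 1, 2, 3}})
# needed[0] == {0, 1}: only lanes x and y of b are ever read.
# live_in == {"a": {2, 3}}: only lanes z and w of a are needed on entry.
```

In this toy setting the analysis shows that two of the four lanes computed for `b` are never read, which is the kind of lane-level fact a SIMD optimization could exploit. A real compiler would also need to handle control flow, arithmetic operations, and memory accesses.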

