Affine calculator程式平行化計算器實作於GCC編譯器

雖然多核心處理器架構越來越普遍，但是在這種架構上撰寫程式並有效發揮每個處理單元的運算能力仍然是一件困難的事情。數個研究主題目標在於解決此類問題，自動平行化即是其中強而有力但同時也是相當困難的解決方法之一。 Affine partitioning 提供了一種有系統的方法，能夠在多核心處理器的架構上找到近似於最佳的計算與資料分解法。 Affine partitioning 是一個強大的統一常論，它的架構融合了許多高階的優化方法，例如: 迴圈調換，迴圈反轉，迴圈融合，迴圈分離等許多方法。基於這個統一的架構，能夠在一個由仿射資料參考所組成的巢狀迴圈中找到最大的平行度同時將程式中的通訊降到最小。在這篇論文中，我們提出一個GCC編譯器中的分析過程，我們稱為” affine calculator pass”，這個分析過程將affine partitioning 演算法和GCC編譯器整合，達到了自動平型化的目標。我們成功的編譯了數個以C語言撰寫的程式，這些程式由包含不同型態的陣列存取以及不同類型的資料相依的巢狀迴圈所組成。完成編譯後輸出的是一個執行檔，每個執行檔都使用了GOMP 函式庫來達到充分利用處理器的每個運算處理單元。

關鍵字

自動平行化；編譯器

並列摘要

Multicore processors have become pervasive in these days. But, it is still difficult to program these architectures to effectively utilize the computation power of multiple processing units. There are several research topics addressing this issue, one of the strong and at the same time very hard approaches is automatic parallelization. Affine partitioning provides a systematic framework to find asymptotically optimal computation and data decomposition for multicore processors. Affine partitioning is a powerful unifying theory, the framework uniformly models a large class of high level optimizations such as loop interchange, reversal, skewing, fusion, fission, re-indexing, scaling. Based on this unified framework, that maximize parallelism while minimizing communication in programs with arbitrary loop nestings and affine data accesses. In this papaer, we proposed a compiler pass in GCC called “affine calculator pass” which integrates affine partitioning framework and GCC to achieve the goal of automatic parallelization. We have successfully compiled some programs in C language which composed of different types of array access dependencies in arbitrary nested loop. And the outputs of the compilation are executables. Each of these executables can execute in parallel using GOMP library to utilize the processing units of multicore processor.

並列關鍵字

Auto parallelization ； Compiler ； GCC ； Affine partitioning ； OpenMP

參考文獻

[2] A. Lim, S. Liao, and M. Lam. Blocking and array contraction across arbitrarily nested loops using affine partitioning. In ACM SIGPLAN PPoPP, pages 103–112, 2001.

[3] A. W. Lim, G. I. Cheong, and M. S. Lam. An affine partitioning algorithm to maximize parallelism and minimize communication. In ACM Intl. Conf. on Supercomputing, pages 228–237, 1999.

[4] A. W. Lim and M. S. Lam. Maximizing parallelism and minimizing synchronization with affine partitions. Parallel Computing, 24(3-4):445–475, 1998.

[6] C. Bastoul. Code generation in the polyhedral model is easier than you think. In IEEE Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT’04), pages 7–16, Sept. 2004.

[9] U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model. In Intl. Conf. on Compiler Construction(ETAPS CC), Apr. 2008.

國際替代計量

Affine calculator程式平行化計算器實作於GCC編譯器

全文下載

主題瀏覽