Register allocation issues on highly distributed register file architectures

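Embedded processors developed over the past few years have adopted new hardware designs to curb ever-growing complexity, power consumption, and die area. Distributed register file architectures are considered to need fewer read/write ports than traditional unified register file architectures, but they pose new challenges for compiler technology. Digital signal processors (DSPs) with very long instruction word (VLIW) data-path architectures are increasingly used in embedded devices for multimedia applications. To reduce the power consumption and design cost of VLIW DSP processors, distributed register files and multi-bank register architectures are being adopted, which reduces the number of read and write ports on the register files. This approach brings new challenges for compiler optimization. Such architectures partition the registers into multiple banks, which results in complicated communication mechanisms and small register files. Complicated communication requires a new compilation phase to handle it, while small register files increase the number of spills and degrade performance. This dissertation attempts to resolve these two problems. There are three main results:
- Building on local register file assignment, we propose a heuristic method that derives a global register file assignment from local results. Experimental results show that the proposed method improves performance.
- We identify the issues involved in reducing spill cost on VLIW DSPs with distributed register files. Spills are traditionally carried out through memory, but multi-bank register file architectures open a new possibility: spilled register values can be stored in other register files. In this method, we make use of register files as spill destinations.
- To reduce the spills produced during the register file assignment phase, we estimate the spill cost from two aspects: assignment and spilling. Experiments show that the method not only reduces the spill ratio but also improves performance.
All experiments were run on an Open64-based compiler, and the results show the effectiveness of each method.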

Keywords

distributed register file; register allocation; register file assignment; spilling methods; compiler optimization

Parallel Abstract

Embedded processors developed within the past few years have employed novel hardware designs to reduce ever-growing complexity, power dissipation, and die area. Although distributed register file architectures are considered to require fewer read/write ports than traditional unified register file structures, they pose challenges for compilation techniques that must generate efficient code for them. Digital signal processors (DSPs) with very long instruction word (VLIW) data-path architectures are increasingly being deployed on embedded devices for multimedia processing applications. To reduce the power consumption and design cost of VLIW DSP processors, distributed register files and multi-bank register architectures are being adopted to reduce the number of read and write ports associated with register files, which presents new challenges for devising compiler optimization schemes. Distributed register file architectures divide registers into multiple sets, which leads to complicated communication and small register files. Complicated communication requires a new compilation phase to handle it, and small register files increase spilling and reduce performance. This dissertation attempts to resolve these two issues. There are three primary results:
- A heuristic method is proposed for global register file assignment that makes suitable decisions based on local register file assignment. The experimental results indicate that compilation based on our proposed approach delivers performance improvements.
- We address the issues of reducing the spill cost for a VLIW DSP with distributed register files. Spill code produced by register allocation is traditionally handled by memory spills, but the multi-bank register file architecture provides the opportunity to spill register values onto different register banks. We present a framework that models the live ranges in different register banks and treats register banks as optional spilling locations.
- To reduce the spilling that may be produced by the register file assignment phase, we propose a method that estimates the spilling cost from two aspects: assignment and spilling. We report that the SPIFR method not only reduces spilling ratios but also increases performance.
All experiments were performed using our optimizing compiler based on Open64, and the results show the effectiveness of each method.
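
The idea of treating other register banks as optional spilling locations can be illustrated with a toy sketch. It is not the dissertation's framework: the cost constants, the `Bank` class, and the `choose_spill_location` function are hypothetical names introduced here, and the sketch ignores live-range interference, which the actual framework models. It only shows the basic decision of comparing an inter-bank spill against a memory spill.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical per-access costs in cycles; assumptions, not measured values.
MEMORY_SPILL_COST = 4  # store/reload through memory
BANK_SPILL_COST = 1    # copy to or from another register bank


@dataclass
class Bank:
    """A register bank with a fixed number of registers."""
    name: str
    size: int
    used: int = 0

    def free(self) -> int:
        return self.size - self.used


def choose_spill_location(
    home: str, banks: Dict[str, Bank], accesses: int
) -> Tuple[str, int]:
    """Pick the cheapest place to hold a value spilled from bank `home`.

    Returns (location, cost), where location is a bank name or "memory".
    """
    # Memory is always available, so it is the fallback choice.
    best = ("memory", MEMORY_SPILL_COST * accesses)
    for bank in banks.values():
        # Skip the bank that is out of registers and any other full bank.
        if bank.name == home or bank.free() == 0:
            continue
        cost = BANK_SPILL_COST * accesses
        if cost < best[1]:
            best = (bank.name, cost)
    # Reserve a register in the chosen bank, if one was chosen.
    if best[0] != "memory":
        banks[best[0]].used += 1
    return best
```

With bank A full and one free register in bank B, the first spilled value goes to B. A second spill finds every bank full and falls back to memory, which is the traditional behaviour.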

Parallel Keywords

distributed register file; register allocation; register file assignment; spilling; compiler optimization
