Architecture Exploration of Multi-Core Computer Systems

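As computer technology continues to evolve, people rely on high-performance computer systems to handle increasingly complex scientific computations; cancer gene alignment is a common application. Programs of this kind typically involve large volumes of input data or complex mathematical operations and take a long time to produce results, yet the research findings are needed as early as possible to support the study of disease. Shortening program execution time therefore depends on improving overall computer system performance. Because the central processing unit is the main component for system control, program execution, and computation, improving the processor architecture can substantially reduce system execution time. There are two common approaches to speeding up execution: (1) Processor architecture improvement: as the contribution of IC process technology to hardware performance gains levels off, improving the processor architecture can effectively reduce program execution time. (2) Development of high-performance interconnection networks: a high-performance interconnection network connects multiple processors into a multi-core system, and many processors execute the subroutines or identical operations of a scientific computation in parallel to raise throughput. To improve single-processor performance, this study proposes Caliburn, a new VLIW (Very Long Instruction Word) processor equipped with DynaPack, a dynamic instruction scheduling and packing mechanism. DynaPack resolves the incompatibility between VLIW instructions and native program instructions in traditional VLIW processors and effectively improves instruction execution performance. With the processor architecture improved, designing a high-performance interconnection network is the other goal of this study. This dissertation proposes two interconnection networks, Self Similar Cubic and Bagua Network. Self Similar Cubic uses a cube as its basic building block and expands the network in a self-similar manner; it performs well per unit of hardware cost. The Bagua Network design emphasizes scalability and suits multi-core systems with large numbers of processors. In addition, to verify the feasibility of the proposed processor and interconnection networks, this study builds models of the above hardware using ESL (Electronic System Level) techniques and designs the Verilog HDL hardware from the ESL models. Experimental results for Caliburn, Self Similar Cubic, and Bagua Network are discussed in the relevant chapters of this dissertation.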

Keywords

interconnection network architecture; very long instruction word (VLIW) processor; multi-core computer system

Parallel Abstract

Because of the continuous evolution of computer technology, people have come to rely on high-performance computer systems to perform increasingly complex scientific computations (e.g., cancer genome sequencing). Although research results relating to diseases should be obtained as soon as possible, the relevant programs typically require a considerable amount of time to process abundant input data or complex mathematical calculations before yielding the desired results. The processing time of these programs can be reduced only by increasing the overall performance of computer systems. Because computer systems depend on processors for system control, program execution, and computation, improving the processor architecture can significantly reduce processing time. Two common methods for increasing processing speed are: 1) Improving the architecture of the single processor: the contribution of integrated circuit fabrication processes to hardware performance gains has leveled off; thus, enhancing the processor architecture can significantly reduce program processing time. 2) Developing high-performance interconnection networks that integrate many processors: when a high-performance interconnection network connects multiple processors into a multi-core system, the processors can execute the subroutines or identical operations of a scientific computation in parallel, increasing throughput. To improve single-processor performance, this dissertation proposes a novel very long instruction word (VLIW) processor, named Caliburn, equipped with dynamic instruction scheduling and packing mechanisms. This dynamic instruction packing mechanism effectively improves processor performance and resolves the incompatibility between VLIW instructions and native program instructions in traditional VLIW processors. With the processor architecture improved, another objective of this dissertation is to design a high-performance interconnection network.
The two proposed interconnection networks are the Self Similar Cubic (SSC) and the Bagua Network (BN). The SSC interconnection network uses cubes as its basic building blocks and expands the network in a self-similar manner. It performs well per unit of hardware cost. The design of the Bagua Network focuses on scalability, making it appropriate for multi-core systems with large numbers of processors. In addition, to verify the feasibility of the proposed processor and interconnection networks, all hardware is modeled using electronic system level (ESL) design, and the hardware is then designed in the Verilog hardware description language according to the ESL models. Finally, the experimental results for Caliburn, SSC, and the Bagua Network are discussed in the corresponding chapters.

Parallel Keywords

multi-core computer system; interconnection network; VLIW processor
