Extending the Effective Instruction Window with Low Complexity to Tolerate Data Load Miss Latency

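Increasing the instruction window size conflicts with maintaining a high clock frequency. Run-ahead execution can expose more memory-level parallelism. However, because it is difficult to maintain instruction dependencies across a large instruction window, the results produced in the run-ahead state are wasted. We propose a large instruction window design that, compared with a conventional out-of-order processor, uses only simple structures and simple management in order to sustain a high clock rate. The main idea is that, when a long-latency cache-miss instruction is encountered, the instruction stream and its partial execution results are temporarily moved into a large, fast preserving instruction queue. The instruction window can therefore continue to exploit parallelism among future instructions while the long-latency cache miss is outstanding. Experimental results show that, on a four-way processor with a 1,000-entry preserving instruction queue, the design achieves speedups of 5% and 10% over the original run-ahead design for SPEC INT2000 and SPEC FP2000, respectively.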

Keywords

instruction window; latency tolerance; complexity

Parallel Abstract

The conflict between increasing the instruction window size and keeping the clock cycle time short is getting worse. Run-ahead execution eases this problem by exploiting higher memory-level parallelism (MLP). However, the execution results produced in the run-ahead state are wasted because of the difficulty of maintaining dependencies across a large instruction window. We propose a large instruction window design that keeps the cycle time of simple structures and is easy to manage. The main idea is that, when a long-latency load miss occurs, the instructions, together with their execution results, are sequentially driven into a large and fast preserving buffer. With a 1K-entry preserving buffer, the experimental results show that a 4-way processor with our design achieves speedups of 5% and 10% over the original run-ahead execution design for SPEC INT2000 and SPEC FP2000, respectively.
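
The abstract gives only the outline of the mechanism, so the following is a minimal sketch of the idea rather than the authors' design. It assumes the preserving buffer behaves as a bounded FIFO that receives instructions in program order together with any result already computed, and that it is drained oldest-first once the miss returns. The names `Instr`, `PreservingBuffer`, and `run_through_miss` are hypothetical and do not come from the thesis.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    seq: int                      # program-order sequence number
    op: str                       # mnemonic, for illustration only
    result: Optional[int] = None  # set if the instruction already executed


class PreservingBuffer:
    """Toy bounded FIFO standing in for the preserving buffer (assumed model)."""

    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self.entries = deque()

    def is_full(self) -> bool:
        return len(self.entries) >= self.capacity

    def push(self, instr: Instr) -> bool:
        """Append in program order; return False when no entry is free."""
        if self.is_full():
            return False
        self.entries.append(instr)
        return True

    def drain(self):
        """Yield preserved instructions oldest-first once the miss returns."""
        while self.entries:
            yield self.entries.popleft()


def run_through_miss(stream, buffer: PreservingBuffer) -> int:
    """Move instructions into the buffer while a miss is outstanding.

    Returns how many instructions were preserved before the buffer filled.
    """
    preserved = 0
    for instr in stream:
        if not buffer.push(instr):
            break  # buffer full: this toy model simply stops here
        preserved += 1
    return preserved
```

In this sketch a partial result such as `Instr(1, "add", result=7)` keeps its value while it sits in the buffer, which is the property the abstract contrasts with run-ahead execution, where such results are discarded. What the real design does when the buffer fills, and how dependencies are re-established on drain, is not stated in the abstract and is left out here.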

Parallel Keywords

instruction window; latency tolerance; complexity

