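Increasing the size of the instruction window conflicts with maintaining a high clock frequency. Run-ahead execution can exploit higher memory-level parallelism. However, because it is difficult to maintain instruction dependences in a large instruction window, the execution results produced in the run-ahead state are wasted. We propose a large instruction window design that, compared with a conventional out-of-order processor, uses only simple structures and management in order to maintain a high clock rate. The main concept is that, when an instruction incurs a long cache miss, the instruction stream and part of the execution results are temporarily moved to a large-capacity, fast preserving instruction queue. The instruction window can therefore continue to exploit parallelism among future instructions while the long cache miss is outstanding. Experimental results show that, on a 4-way processor with a 1,000-entry preserving instruction queue, the design achieves speedups of 5% and 10% over the original run-ahead design for SPEC INT2000 and SPEC FP2000, respectively.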
The conflict between enlarging the instruction window and keeping the clock cycle time short is getting worse. Run-ahead execution eases this problem by exploiting higher memory-level parallelism (MLP). However, the execution results produced in the run-ahead state are wasted because it is difficult to maintain dependences across a large instruction window. We propose a large instruction window design that relies only on simple structures and easy management, so that the cycle time remains that of those simple structures. The main idea is that, when a long-latency load miss occurs, the instructions are driven sequentially, together with their execution results, into a large and fast preserving buffer. With a 1K-entry preserving buffer, experimental results show that a 4-way processor using our design achieves speedups of 5% and 10% over the original run-ahead execution design on SPEC INT2000 and SPEC FP2000, respectively.
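The main idea can be illustrated with a toy software model. The sketch below is only an assumption-laden illustration, not the proposed hardware: the names `Instr`, `fill_preserving_buffer`, and `replay_count`, the `long_miss` flag, the register-poisoning rule, and the stop-on-full policy are all hypothetical choices made for the example. It shows instructions being driven in program order into a bounded FIFO once a long-latency load misses, with the results of miss-independent instructions kept rather than discarded.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dest: str
    srcs: tuple = ()
    long_miss: bool = False  # hypothetical flag: a load with a long-latency miss


def fill_preserving_buffer(program, capacity=1024):
    """Drive instructions, in program order, into a bounded FIFO after a long miss.

    Returns a list of (instr, done) pairs. done is True when the instruction did
    not depend on the missing load, so its result could be preserved.
    """
    buffer = deque()
    poisoned = set()  # registers whose value depends on the outstanding miss
    in_miss = False
    for ins in program:
        if ins.long_miss:
            in_miss = True
            poisoned.add(ins.dest)
            done = False
        elif not in_miss:
            continue  # executes normally; the buffer is not involved
        else:
            done = not any(src in poisoned for src in ins.srcs)
            if done:
                poisoned.discard(ins.dest)  # register now holds a valid value
            else:
                poisoned.add(ins.dest)
        if len(buffer) == capacity:
            break  # assumption: the toy model simply stops when the buffer is full
        buffer.append((ins, done))
    return list(buffer)


def replay_count(buffer):
    """Buffered instructions that must still execute once the miss returns."""
    return sum(1 for _, done in buffer if not done)


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("load", "r4", ("r1",), long_miss=True),
    Instr("add", "r5", ("r4",)),        # depends on the missing load
    Instr("mul", "r6", ("r2", "r3")),   # independent: result is preserved
    Instr("sub", "r7", ("r5", "r6")),   # depends on the load through r5
]
buffered = fill_preserving_buffer(program)
print(len(buffered), replay_count(buffered))
```

In this model only the instructions that transitively depend on the missing load need to execute after the miss is serviced; a plain run-ahead scheme would instead discard all results produced during the miss.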