An Efficient Architecture for Resolving the Aging Problem of Snoop Filter

Advisor: 張世杰

Abstract


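Cache coherence is the mechanism that keeps shared data held in cache memories consistent. Among coherence schemes, snoop-based protocols are widely used in multiprocessor system-on-chip applications because of their simplicity. To answer each snoop request, the cache controller performs a cache tag lookup, checking the cache tags of its cache lines to decide whether the requested data is present in the cache. Previous studies report that, because the amount of data shared among nodes is limited, about 90% of snoop requests are redundant. These redundant requests waste system energy by forcing the cache controller to perform tag lookups. Snoop filters were therefore proposed to screen out useless snoop requests. A snoop filter must compress the addresses of all data the cache has read into the filter. Because of this compression, the filter can make wrong decisions, known as false positives: a false-positive request passes the filter and reaches the cache, and only the tag lookup reveals that it was redundant. Over time, the large amount of compressed data accumulated in the filter raises the probability of false positives, so an inefficient, aged filter causes many wasted tag lookups.

To address the problems caused by an inefficient aged filter, IBM proposed a method for refreshing the filter, with the refresh triggered when a cache wrap occurs. If cache wraps occur only after long intervals, the filter starts to lose efficiency, and the refresh may even fail to achieve its purpose. We find that in some applications (SPLASH-2), the interval between cache wraps is long and the filter's false-positive rate rises in the meantime. This thesis therefore focuses on how to refresh an aged filter rather than on how to design a filter. We propose our filter rejuvenation technique to solve the problems caused by an inefficient aged filter.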
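
The filtering, false-positive, and aging behaviour described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the IBM scheme or the technique proposed in this thesis: the filter is a plain bit vector indexed by a hypothetical modulo hash, the cache uses FIFO replacement, and "rejuvenation" is modelled as clearing the filter and re-inserting only the lines still in the cache. All names (`ToySnoopFilter`, `run_demo`, and so on) are invented for illustration.

```python
from collections import deque


class ToySnoopFilter:
    """Illustrative filter: one bit per hash bucket of the line address.

    Assumption: this is NOT the thesis's or IBM's design; it only shows
    address compression, false positives, aging, and a rebuild.
    """

    def __init__(self, num_bits=64):
        self.num_bits = num_bits
        self.bits = [False] * num_bits

    def _index(self, addr):
        # Hypothetical hash: simple modulo on the line address.
        return addr % self.num_bits

    def record(self, addr):
        # Called whenever the cache reads (fills) a line.
        self.bits[self._index(addr)] = True

    def may_contain(self, addr):
        # False means "definitely not cached", so the snoop is filtered out.
        return self.bits[self._index(addr)]

    def rebuild(self, cached_addrs):
        # Rejuvenation sketch: clear everything, re-insert only live lines.
        self.bits = [False] * self.num_bits
        for addr in cached_addrs:
            self.record(addr)


def count_wasted_lookups(filt, cache, snoops):
    """Count snoops that pass the filter but miss in the cache."""
    return sum(1 for a in snoops if filt.may_contain(a) and a not in cache)


def run_demo():
    filt = ToySnoopFilter(num_bits=64)
    capacity = 8
    cache = set()
    fifo = deque()  # FIFO replacement order (an assumption of this sketch)

    # The cache streams through 200 distinct lines; evicted lines are never
    # removed from the filter, so stale bits accumulate (aging).
    for addr in range(200):
        if len(cache) == capacity:
            cache.discard(fifo.popleft())
        cache.add(addr)
        fifo.append(addr)
        filt.record(addr)

    # Remote snoops to addresses this cache never held: all are redundant.
    snoops = list(range(1000, 1064))

    aged = count_wasted_lookups(filt, cache, snoops)
    filt.rebuild(cache)
    fresh = count_wasted_lookups(filt, cache, snoops)
    return aged, fresh, filt, cache


if __name__ == "__main__":
    aged, fresh, _, _ = run_demo()
    print("wasted tag lookups with aged filter:", aged)
    print("wasted tag lookups after rebuild:", fresh)
```

In this toy setting the aged filter lets every redundant snoop through, while the rebuilt filter only passes those that happen to share a hash bucket with a line still in the cache. A real design must also decide when a rebuild is worthwhile, which is the question the abstract raises about cache wraps.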

Parallel Abstract


Snoop-based coherence protocols are very popular in multiprocessor systems because of their simplicity. In a snoop-based protocol, many cache tag lookups are needed to serve snoop requests. However, it has been shown that about 90% of snoop requests are useless, so the corresponding cache lookups are redundant. To reduce unnecessary cache lookups, the snoop filter scheme was proposed. However, the efficiency of a snoop filter is known to decrease over time; in other words, an aging filter can no longer filter out unnecessary requests effectively. To solve the problem of an aging snoop filter, [8] proposed a way to rejuvenate it so that the filter can be refreshed to high efficiency again. We observe that in several real designs, the method of [8] fails to achieve effective rejuvenation. In this paper, we focus on how to rejuvenate a snoop filter rather than on how to design the snoop filter itself. We propose a novel way of rejuvenating an aging snoop filter using four filter rejuvenation techniques. Our experimental results show that the proposed techniques, when working together, reduce the number of unnecessary requests to 62.23% and the energy consumption to 67.58% on average. In the best case, the number of unnecessary requests is reduced to approximately 30% compared to [8].

References


[1] E. Atoofian and A. Baniasadi, “Using supplier locality in power-aware interconnects and caches in chip multiprocessors,” J. Systems Architecture 54(5): 507-518, 2008.
[2] E. Atoofian, A. Baniasadi and K. Aasaraai, “Speculative supplier identification for reducing power of interconnects in snoopy cache coherence protocols,” CF 2007: 259-266.
[3] M. Blumrich, V. Salapura and A. Gara, “Exploring the architecture of a stream register-based snoop filter,” 2011.
[4] A. Moshovos, “RegionScout: Exploiting Coarse Grain Sharing in Snoop-Based Coherence,”
[5] A. Moshovos, G. Memik, B. Falsafi and A. Choudhary, “JETTY: Filtering Snoops for Reduced Energy Consumption in SMP Servers,” HPCA, 2001.
