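Chip multiprocessors (CMPs) have become the norm for processor chips. Because trace-driven simulators explore the architecture design space quickly, they are widely used in CMP design. With parallel multicore machines of up to 64 cores now at hand, such as Tilera's TILE64, parallelizing trace-driven simulation to explore architecture designs even faster is no longer difficult. Yet very few papers discuss parallel trace-driven simulation. This thesis discusses the design and implementation of a parallel trace-driven simulator for cache coherence on network-on-chip (NoC) based multicore chips, which we name TileSim+. By providing a cycle-accurate NoC model and a cycle-count-accurate cache simulation model, TileSim+ supports not only precise evaluation of memory access time but also exploration of the cache design space for NoC-based CMPs. Most importantly, accelerated by Tilera's TILE64, TileSim+ speeds up trace-driven simulation without losing scalability. An experimental evaluation of TileSim+ on the TILE64 shows that the simulator produces correct simulation results for the tested benchmark programs and achieves good speedup over a sequential simulator. The thesis also shows readers how to use TileSim+ to evaluate the cache design space.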
The chip multiprocessor (CMP) is becoming the norm for processor chips. In CMP design, trace-driven simulation is a commonly used technique for fast exploration of the architecture design space. With the availability of parallel computers such as Tilera’s TILE64, parallel trace-driven simulation for faster architecture evaluation is becoming possible. However, very few papers discuss parallel trace-driven simulation. This thesis discusses the design and implementation of TileSim+, a parallel trace-driven simulator for NoC-based cache-coherent CMPs. TileSim+ provides a cycle-accurate network model and a cycle-count-accurate cache simulation model, which allow not only precise evaluation of memory access delay but also exploration of the cache design space for NoC-based CMPs. Most importantly, accelerated by a machine such as Tilera’s TILE64, TileSim+ speeds up trace-driven simulation with good scalability. The experimental evaluation of TileSim+ on the TILE64 shows that it obtains correct simulation results for the tested benchmark programs and achieves good speedup over a sequential simulator. We also demonstrate how to use TileSim+ to evaluate CMP cache designs.