
適用於二維格狀多處理器晶片系統之可容錯晶片內網路架構

Fault-tolerant On-chip Network Architecture for 2D-mesh Based Chip Multiprocessor Systems

Advisor: 吳安宇 (An-Yeu Wu)

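Abstract


In this thesis, to improve the fault tolerance of on-chip networks and reduce their performance loss under faults, we propose two on-chip network architectures: 1) the 20-path router with BIST/SD/FI (20PR), a router design with built-in self-test, self-diagnosis, and fault-isolation circuits; and 2) the Surrounding Test Ring (STR), an architecture that tests and diagnoses the on-chip network from outside. Besides providing Built-in Self-Test/Self-Diagnosis and Fault-Isolation functions, both architectures can use the undamaged parts of a router to reduce performance loss under faults, so the system can exploit this property to reassign work to fault-free paths and keep operating normally. In our experiments, the built-in self-test and diagnosis circuit of 20PR completes testing in 117 cycles, while STR completes testing in 144 to 376 cycles. An on-chip network using 20PR incurs 15.17% extra hardware cost, and one using STR incurs 8.48% to 13.3%. In terms of performance, compared with the traditional approach of shutting down the entire faulty router, the number of packets that must be remapped is reduced by 75.68% to 83.29% with 20PR and by 68.33% to 79.31% with STR. System latency is reduced by 7.25% to 24.57% with 20PR and by 4.86% to 23.6% with STR. The results show that the proposed fault-tolerant on-chip network architectures effectively reduce the performance loss of faulty on-chip networks.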

Parallel Abstract


In this thesis, to improve fault tolerance and reduce performance degradation in faulty on-chip networks, two on-chip network (OCN) architectures are proposed: 1) the 20-path router (20PR), a router with embedded Built-in Self-Test/Self-Diagnosis (BIST/SD) and Fault-Isolation (FI) circuits; and 2) the Surrounding Test Ring (STR), an external test architecture that performs test and diagnosis of the on-chip network from outside. Both provide BIST/SD and FI circuits that detect, locate, and isolate the impact of faulty FIFOs and MUXs in faulty routers. Moreover, 20PR and STR use the undamaged datapaths in faulty routers to reduce performance degradation. The operating system can remap tasks onto the undamaged datapaths that the proposed architectures identify, so that the system keeps functioning. In our experiments, the BIST/SD of the 20PR executes in a constant 117 test cycles, and that of the STR executes in 144 to 376 test cycles. The hardware overhead of an OCN using 20PRs is 15.17%, while that of OCNs with STRs is 8.48% to 13.3%. The experiments also show improved performance over prior approaches that completely disable faulty routers. Compared with those approaches, the number of remapped packets is reduced by 75.68% to 83.29% for 20PR and by 68.33% to 79.31% for STR, and system latency is reduced by 7.25% to 24.57% for 20PR and by 4.86% to 23.6% for STR. The experiments show that the proposed fault-tolerant OCN architectures achieve graceful degradation in faulty mesh OCNs.

