  • Dissertation

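耐用晶片設計-對於先進製程設計成本與可靠度之方法研究 (Durable Chip Design: Methods for Balancing Design Cost and Reliability in Advanced Process Technologies)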

Live Free or Die Hard - Leveraging Design Cost and Reliability for Modern Design Techniques

Advisor: 張世杰 (Shih-Chieh Chang)

Abstract


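As process technology advances, today's chip designs achieve small size, complex functionality, and high performance, and they are widely used in portable devices such as remote sensors and smart portable devices. Given the limited power supply of these devices, however, low-power design has become an indispensable technique for such applications. At the same time, chip reliability is often sacrificed to achieve low power, which creates potential reliability problems. Balancing low power against reliability has therefore become an important research topic. In this dissertation, we examine three common low-power chip design techniques: power gating, three-dimensional ICs (3D ICs), and dynamic voltage/frequency scaling. We identify the potential reliability problems that each technique can cause and propose novel, efficient methods to trade off design cost against chip reliability, with the goal of substantially improving reliability without adding much cost.
In the first part, we address the loss of register data in power-gated designs. In earlier practice, scan chains or retention registers were widely used to ensure that register data is not lost when power gating takes effect. However, the former greatly lengthens data transfer time, while the latter increases design area by about 20%. We propose the concept and design of multi-bit retention registers, together with a method for placing them in a design. Multi-bit retention registers combine the advantages of the two traditional approaches: they shorten transfer time, reduce the additional design area, and save further power. Experimental results show that, compared with conventional single-bit retention registers, we save 84% of leakage current on average.
In the second part, we study in depth the reliability problems of current 3D IC stacking. Since it was proposed, 3D IC stacking has been widely studied and applied in recent years because of its high speed and low power. The through-silicon via (TSV) is an indispensable component of 3D ICs, yet in current processes its low yield poses a serious threat to 3D IC reliability. We therefore study the placement of redundant TSVs and optimize their number and placement under yield and timing requirements in order to reduce the added area. Experimental results show that we can save 61% of the additional area.
In the third part, we consider dynamic voltage/frequency scaling, which is widely applied in today's low-power designs and saves power by adjusting the operating voltage and frequency dynamically according to performance demand. Lowering the voltage too far, however, causes reliability problems and in turn functional errors. Determining the most suitable operating voltage at runtime is therefore a key part of dynamic voltage/frequency scaling. We propose two approaches, one based on the joint probability distribution (JPDF) and one based on Q-learning, to determine the optimal operating voltage dynamically. Experimental results show that, compared with the conventional stepping-based method, we achieve a power reduction of 89.3%.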
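The yield side of the redundant-TSV trade-off in the second part can be illustrated with a small calculation. The sketch below is not the dissertation's algorithm: it assumes, purely for illustration, that each group of signal TSVs shares one spare, that TSVs fail independently with the same probability `p_fail`, and that a group works when at most one of its TSVs fails. Timing constraints and physical placement are ignored, and the function names are hypothetical.

```python
import math


def group_yield(n_signal, p_fail):
    """Probability that n_signal TSVs plus one shared spare are functional."""
    total = n_signal + 1
    ok = 1.0 - p_fail
    # The group survives if no TSV fails or if exactly one of the
    # n_signal + 1 TSVs fails (the spare covers a single failure).
    return ok ** total + total * p_fail * ok ** (total - 1)


def chip_yield(group_sizes, p_fail):
    """Yield of a chip whose TSVs are partitioned into the given groups."""
    return math.prod(group_yield(n, p_fail) for n in group_sizes)


def largest_uniform_group(num_tsvs, p_fail, target_yield):
    """Largest uniform group size that still meets the yield target.

    Larger groups need fewer spares (less area) but tolerate fewer
    failures, so the search runs from the largest size downward.
    Returns None if no group size meets the target.
    """
    for size in range(num_tsvs, 0, -1):
        full, rest = divmod(num_tsvs, size)
        sizes = [size] * full + ([rest] if rest else [])
        if chip_yield(sizes, p_fail) >= target_yield:
            return size
    return None
```

Under these assumptions, a group of four signal TSVs with a 1% failure probability has a yield of about 0.9990 with one spare, compared with about 0.9606 without it, which is why the number of TSVs sharing each spare drives both yield and area overhead.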

Parallel Abstract


As CMOS technology continues to scale down, integrated circuits (ICs) have been widely applied in applications such as remote sensors and smart portable devices, thanks to their smaller area, higher complexity, and higher performance. At the same time, the limited power budget of these applications makes low-power design indispensable. However, existing low-power design schemes may cause reliability issues and threaten the integrity of the ICs, which leaves room for further investigation. In this dissertation, we address three major reliability challenges in modern low-power IC design, namely those arising in power-gated designs, 3D ICs, and dynamic voltage/frequency scaling, and we propose efficient and intelligent methods that gracefully trade off design cost against reliability.
First, retention registers have been widely used in power-gated designs to store data during sleep mode. However, their excessive area and leakage power make it imperative to minimize the total retention storage size. Current industry practice replaces all registers with single-bit retention registers, which significantly limits design freedom and yields sub-optimal designs. To address this, we propose, for the first time in the literature, the concept and design of multi-bit retention registers, with which only selected registers need to be replaced. The technique significantly reduces the number of bits that must be stored, and thus the leakage power, but requires several clock cycles for mode transition. In addition, we develop an efficient assignment algorithm that minimizes the total retention storage size subject to a mode transition latency constraint. Experimental results show that, compared with a design based on single-bit retention registers, our framework reduces leakage power in sleep mode by 84% on average, at the cost of an additional mode transition latency of 6 to 11 clock cycles.
Second, in three-dimensional integrated circuits (3D ICs), the through-silicon via (TSV) is a critical enabling technology that provides vertical connections. However, TSVs may suffer from reliability issues such as undercut, misalignment, and random open defects. Various fault-tolerance mechanisms have been proposed in the literature to improve yield, at the cost of significant area overhead. In this part, we focus on the structure that uses one spare TSV for a group of original TSVs, and we study the optimal assignment of spare TSVs under yield and timing constraints to minimize the total area overhead. We show that this problem can be modeled as a constrained graph decomposition problem, and we develop two efficient heuristics to solve it. Experimental results show that, under the same yield and timing constraints, our heuristic reduces the area overhead induced by the fault-tolerance mechanism by up to 61% compared with a seemingly more intuitive nearest-neighbor heuristic.
Third, dynamic voltage scaling (DVS) has been widely used to reduce power consumption in modern designs. Choosing the optimal operating voltage at runtime requires accounting for variations in workload, process, and environment. Because these variations are hard to predict accurately at design time, various deterministic and reinforcement-learning-based DVS schemes have been proposed in the literature. However, none of them can be readily applied to designs with graceful degradation, in which timing errors are allowed with bounded probability in exchange for further power reduction. In this part, we propose a JPDF-based and a Q-learning-based DVS scheme dedicated to designs with graceful degradation, and we compare them with a deterministic, stepping-based DVS scheme. Experimental results on three 45nm industrial designs show that the proposed Q-learning-based scheme achieves up to 83.9% power reduction with a timing error probability bound of 0.01. To the best of the authors' knowledge, this is the first in-depth work to explore reinforcement-learning-based DVS schemes for designs with graceful degradation.
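The idea of Q-learning-based DVS under a timing-error bound can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not the scheme evaluated in the dissertation. The voltage levels, per-cycle error probabilities, observation window, penalty, and learning parameters are all hypothetical, and power is modeled simply as proportional to the square of the supply voltage. The agent's state is the current voltage level, its actions are to step down, hold, or step up, and its reward is the negative power with a penalty whenever the observed error rate exceeds the bound.

```python
import random

# Hypothetical operating points; these numbers are illustrative only.
VOLTAGES = [0.6, 0.7, 0.8, 0.9, 1.0]            # supply levels in volts
ERROR_PROB = [0.3, 0.05, 0.0005, 0.00005, 0.0]  # per-cycle timing-error probability
ERROR_BOUND = 0.01                              # allowed timing-error probability
ACTIONS = (-1, 0, 1)                            # step down, hold, step up
WINDOW = 400                                    # cycles observed per decision
PENALTY = 10.0                                  # cost of violating the bound


def next_level(level, action):
    """Move one voltage level, clamped to the available range."""
    return min(max(level + action, 0), len(VOLTAGES) - 1)


def step(level, action, rng):
    """Apply an action, observe one window, and return (next_level, reward)."""
    nxt = next_level(level, action)
    errors = sum(rng.random() < ERROR_PROB[nxt] for _ in range(WINDOW))
    power = VOLTAGES[nxt] ** 2  # assumption: power proportional to V^2
    violated = errors / WINDOW > ERROR_BOUND
    return nxt, -power - (PENALTY if violated else 0.0)


def q_update(q, s, a, reward, s_next, alpha=0.2, gamma=0.9):
    """One tabular Q-learning update for state s and action index a."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train(episodes=200, steps=40, epsilon=0.3, seed=0):
    """Learn a Q-table with epsilon-greedy exploration and random starts."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in VOLTAGES]
    for _ in range(episodes):
        s = rng.randrange(len(VOLTAGES))  # random start so every level is visited
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
            s_next, reward = step(s, ACTIONS[a], rng)
            q_update(q, s, a, reward, s_next)
            s = s_next
    return q


def greedy_level(q, start=len(VOLTAGES) - 1, steps=20):
    """Follow the learned greedy policy from `start` and return the final level."""
    s = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s = next_level(s, ACTIONS[a])
    return s
```

Calling `greedy_level(train())` follows the learned policy down from the nominal voltage. In this toy model the policy should tend toward the lowest level whose error probability stays within `ERROR_BOUND`, because lower levels incur the penalty and higher levels waste power. A stepping-based controller would instead move one level at a time using a fixed rule, without learning from observed rewards.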

