Performance-Driven Block and Input/Output Buffer Placement for Flip-Chip Design

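Flip-chip packaging offers the highest packaging density for supporting pad-limited ASIC designs. One of the most important characteristics of flip-chip design is that input/output buffers can be placed anywhere in the chip. For most practical designs, the timing of the input/output signals must be controlled to ensure correct circuit operation. This can be achieved by controlling the relative positions of blocks, input/output buffers, and first-stage/last-stage cells in the flip chip. In addition, we aim to simultaneously minimize the path length between blocks and bump balls and the delay skew among different paths. In this thesis, we propose a two-stage placement method for the block and input/output buffer placement problem in flip-chip design. In the first stage, we apply simulated annealing with the B*-tree representation to obtain an initial placement of blocks and buffers that minimizes the maximum wirelength between blocks and bump balls. In the second stage, we apply an iterative algorithm to improve the initial placement. By finding the zero-skew position of each input/output buffer, we minimize the difference between that buffer's signal delay and the maximum input/output signal delay. The iterative improvement terminates when the difference between every buffer's signal delay and the maximum signal delay is smaller than a user-specified signal skew range. Compared with the placement obtained using the B*-tree alone and the placement obtained by [16], our algorithm produces better results. In terms of the cost function, the cost of the B*-tree-only placement (the placement of [16]) is on average 32.23 (14.08) times that of our method. In terms of running time, the B*-tree-only placement (the placement of [16]) takes 15.34 (10.47) times as long as our method. By setting an appropriate signal skew range, we can even obtain a block and input/output buffer placement with zero signal skew.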

Keywords

flip-chip; input/output; buffer; placement

Parallel Abstract

The flip-chip package offers the highest chip density of any packaging method for supporting pad-limited ASIC designs. One of the most important characteristics of flip-chip designs is that the input/output buffers can be placed anywhere inside the chip. For most practical designs, we have to control the timing of the input/output signals. This can be achieved by controlling the positions of bump balls, input/output buffers, and first-stage/last-stage cells in a flip chip. Specifically, we intend to minimize both the path length between blocks and bump balls and the delay skew of the paths. In this thesis, we propose a two-stage placement method for block and input/output buffer placement in flip-chip design. In the first stage, we apply simulated annealing with the B*-tree representation to minimize the maximum wirelength and obtain an initial feasible placement. In the second stage, we apply an iterative algorithm to improve the initial solution. In each iteration, we find the zero-skew position for each buffer to minimize the signal delay skew between that buffer and the one with the maximum signal delay. The iterative improvement terminates when the signal delay skews of all input/output buffers fall within a user-specified range. Compared with the placement using the B*-tree alone and the work in [16], our method obtains significantly better results. The B*-tree-based algorithm ([16]) yields an overall cost 32.23 times (14.08 times) that of our algorithm. In terms of running time, the B*-tree-based algorithm ([16]) requires 15.34 times (10.47 times) our CPU time. In particular, by setting an appropriate grid size and signal skew range, we can even obtain a placement with zero signal skew for all input/output buffers.
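The second-stage loop described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes, purely for illustration, that signal delay is proportional to the Manhattan length of the bump ball to buffer to block-pin path, that buffers may occupy any site of a uniform grid, and that overlaps between buffers and blocks are ignored. The names `path_delay`, `refine_buffers`, `nets`, and `grid` are hypothetical.

```python
from itertools import product

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def path_delay(bump, buf, pin):
    # Assumed delay model: proportional to the bump -> buffer -> pin
    # Manhattan wirelength (the abstract does not state the delay model).
    return manhattan(bump, buf) + manhattan(buf, pin)

def refine_buffers(nets, buffers, grid, skew_range, max_iters=50):
    """Iteratively move each buffer toward its zero-skew position.

    nets:    list of (bump_ball_xy, block_pin_xy) pairs, one per buffer
    buffers: initial buffer positions (e.g. from a first-stage placement)
    grid:    list of candidate buffer sites (hypothetical uniform grid)
    """
    buffers = list(buffers)
    for _ in range(max_iters):
        delays = [path_delay(b, buf, p) for (b, p), buf in zip(nets, buffers)]
        target = max(delays)
        # Stop once every buffer's skew against the maximum delay is in range.
        if all(target - d <= skew_range for d in delays):
            break
        for i, (bump, pin) in enumerate(nets):
            # Zero-skew position: the site whose delay is closest to the target.
            buffers[i] = min(
                grid, key=lambda s: abs(path_delay(bump, s, pin) - target)
            )
    delays = [path_delay(b, buf, p) for (b, p), buf in zip(nets, buffers)]
    return buffers, delays

# Toy instance: three nets with initial delays 4, 2, and 6.
nets = [((0, 0), (4, 0)), ((0, 2), (2, 2)), ((0, 4), (6, 4))]
grid = list(product(range(7), range(7)))
placed, delays = refine_buffers(nets, [(2, 0), (1, 2), (3, 4)], grid, skew_range=0)
```

In this toy model a shorter path can only be lengthened by detouring the buffer, so the grid pitch limits which skews are reachable; this is one way to read the abstract's remark that both the grid size and the skew range must be chosen appropriately to reach zero skew.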

Parallel Keywords

placement; input/output; buffer; I/O; flip-chip

References

[1] P. H. Buffet, J. Natonio, R. A. Proctor, Yu H. Sun, and G. Yasar, “Methodology for I/O Cell Placement and Checking in ASIC Designs Using Area-Array Power Grid,” Proc. of IEEE Custom Integrated Circuits Conf., pp. 125–128, 2000.

[2] M. A. Breuer, “A Class of Min-Cut Placement Algorithms,” Proc. of ACM/IEEE Design Automation Conf., pp. 284–290, 1977.

[5] P. Dehkordi and D. Bouldin, “Design for Packageability: The Impact of Bonding Technology on the Size and Layout of VLSI Dies,” Proc. of Multichip Module Conf., pp. 153–159, 1993.

[6] W. C. Elmore, “The Transient Response of Damped Linear Networks with Particular Regard to Wideband Amplifiers,” J. Applied Physics, pp. 55–63, 1948.

[8] P.-N. Guo, C.-K. Cheng, and T. Yoshimura, “An O-Tree Representation of Non-Slicing Floorplan and Its Applications,” Proc. of ACM/IEEE Design Automation
