  • Thesis / Dissertation

植基於現場可程式化邏輯閘陣列晶片之高效重置乘法器與集成型記憶體分配架構

Effective Reconfigurable Multiplication and Integrated Memory Distribution on FPGAs

Advisor: 陳銘憲

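Abstract

Digital signal processing (DSP) plays an important role in today's electronic systems, such as mobile communications and control units for self-driving cars. One widely used operation is multiple constant multiplication (MCM), which appears in digital filters and discrete transforms. While application-specific integrated circuits (ASICs) remain widely used, more and more high-performance DSP systems are being implemented on field-programmable gate arrays (FPGAs), and this has led much research to focus on reducing the complexity of MCM blocks. However, because the computing resources of an FPGA are limited, DSP applications that use MCM operations must frequently reconfigure the multiplier coefficients inside the accelerator, and so far no research has focused on developing MCM blocks under a unified hardware topology to avoid time-consuming reconfiguration. In addition, these DSP systems usually need a highly parallel memory architecture to guarantee consistent data access. Although today's FPGAs provide dual-port memory blocks, building multi-ported memories from these blocks to provide parallel access consumes a large amount of memory.

In this dissertation, for applications that need to reconfigure multiplier blocks frequently, we formulate a problem called unified multiple constant multiplication (UMCM) and propose a framework called compatible graph synthesis to construct unified MCM blocks efficiently. In this framework, a single set of parameters for logarithmic shifters is enough to reconfigure the coefficients of a multiplier block dynamically and quickly. This avoids the reconfiguration techniques provided by the FPGA itself, which would increase the system response time. Experimental results show that the proposed UMCM technique is well suited to DSP applications that must frequently reconfigure multipliers because of limited hardware computing resources.

Furthermore, to ensure consistent data access in these DSP systems, we also propose a parallel memory access architecture that supports multi-port switching, called IMPC, to address the heavy consumption of memory and logic cells. The architecture predefines a set of memory units with soft/hard access ports and then solves a minimum set packing (MSP) problem to find a combination that uses the fewest memory units. Experimental results show that the proposed architecture effectively reduces the consumption of memory and related computing resources.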
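The idea of exchanging coefficients by changing only shifter parameters can be illustrated with a small Python model. This is a minimal sketch under stated assumptions, not the compatible graph synthesis framework from the dissertation: the two-level adder topology, the names `BlockConfig` and `multiplier_block`, and the coefficient sets are all hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Tuple


class BlockConfig(NamedTuple):
    """Shift amounts for one coefficient set (hypothetical layout)."""
    a: int  # shift on the first input of the shared adder
    b: int  # shift on the second input of the shared adder
    outputs: Tuple[Tuple[int, int], ...]  # (c, d) shift pair per output adder


def multiplier_block(x: int, cfg: BlockConfig) -> Tuple[int, ...]:
    """Multiply x by several constants using one fixed shift-and-add topology.

    The adder graph never changes: one shared adder feeds every output adder.
    Only the shift amounts in cfg differ between coefficient sets.
    """
    shared = (x << cfg.a) + (x << cfg.b)
    return tuple((shared << c) + (x << d) for c, d in cfg.outputs)


def constants(cfg: BlockConfig) -> Tuple[int, ...]:
    """Recover the constants a configuration implements by feeding in x = 1."""
    return multiplier_block(1, cfg)


# Two assumed coefficient sets that share the same adder topology.
FILTER_A = BlockConfig(a=1, b=0, outputs=((1, 0), (2, 0)))  # constants 7 and 13
FILTER_B = BlockConfig(a=2, b=0, outputs=((1, 0), (2, 0)))  # constants 11 and 21
```

Switching from `FILTER_A` to `FILTER_B` changes a single shift amount while the adders stay in place, which is the property that makes a coefficient exchange possible without reloading the hardware.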

English Abstract


Digital signal processing (DSP) plays a significant role in nearly every modern electronic system, including mobile communication, automotive control units, biomedical applications, and high-energy physics, to name a few. A very frequent but resource-intensive operation in DSP systems is the multiplication of a variable by several constants, commonly called multiple constant multiplication (MCM). It is needed, e.g., in digital filters and discrete transforms. While high-performance DSP systems were traditionally realized as application-specific integrated circuits (ASICs), there is a growing trend toward generic programmable logic ICs such as field-programmable gate arrays (FPGAs). This has produced a rich body of literature on reducing the complexity of MCM on FPGAs. However, the limited silicon area on FPGAs requires applications such as image processing to frequently exchange the multiplier (filter) blocks in a series of filtering operations, and no previous work has considered the MCM problem under a topological constraint to avoid time-consuming partial reconfiguration on FPGAs. Moreover, these DSP applications typically demand highly parallel memory structures to keep pace with their concurrent nature, since memories are usually the bottleneck of computation performance. Most FPGA devices provide only dual-ported SRAM blocks, which leads to a relatively large resource overhead for multi-ported random access memory techniques on FPGAs.

In this dissertation, we define a unified MCM (UMCM) problem of finding a unified hardware topology for the frequently exchanged multiplier blocks, and we introduce a framework termed compatible graph synthesis to solve the problem efficiently. By supplying a set of parameters to logarithmic shifters, multiplier blocks with the unified topology can be exchanged dynamically without partial reconfiguration on FPGAs. Experimental results show that the solution is valuable for applications that require the frequent exchange of multiplier blocks on FPGAs because of a limited silicon area budget.

Because these DSP applications demand highly parallel memory structures to keep pace with their concurrent nature, developing a multi-ported memory distribution for simultaneous write/read transactions on FPGAs is essential. For this purpose, we also propose a multi-ported RAM design with multiple hard/soft switched ports, termed IMPC, to minimize BRAM usage and LUT consumption on FPGAs. This framework takes a set of predefined hybrid hard/soft ported memory instances and creates a specific architecture design by solving a minimum set packing (MSP) problem to optimize its implementation. The experimental results and analysis show that IMPC is a better solution for multi-ported memory configurations in terms of memory consumption and logic resource occupation on FPGAs.
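To make the resource overhead of multi-ported memories concrete, the following Python model sketches one well-known baseline, a live value table (LVT) design built from replicated banks. It is not the IMPC architecture proposed here, and it simplifies each block RAM to a bank with one write side and one read side. The class name `LVTMemory` and its interface are assumptions made for illustration only.

```python
class LVTMemory:
    """Behavioral model of an m-write / n-read memory built from simple banks.

    Each write port owns one bank replica per read port, so every read port
    can fetch from any write port's data without sharing a bank port.
    A live value table (LVT) records which write port last wrote each address.
    """

    def __init__(self, depth: int, n_write: int, n_read: int) -> None:
        # banks[w][r] is the replica written by port w and read by port r.
        self.banks = [
            [[0] * depth for _ in range(n_read)] for _ in range(n_write)
        ]
        self.lvt = [0] * depth

    def write(self, port: int, addr: int, value: int) -> None:
        # Broadcast the value to every replica owned by this write port.
        for replica in self.banks[port]:
            replica[addr] = value
        self.lvt[addr] = port

    def read(self, port: int, addr: int) -> int:
        # Select the replica that belongs to the most recent writer.
        return self.banks[self.lvt[addr]][port][addr]

    @property
    def bank_count(self) -> int:
        # Number of bank replicas this model allocates.
        return len(self.banks) * len(self.banks[0])
```

In this model the number of replicas grows with the product of write ports and read ports, for example six banks for two write ports and three read ports, which illustrates why reducing block RAM usage is a design goal for multi-ported memories on FPGAs.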

Keywords

FPGA, MCM, UMCM, Multi-Ported Memory

