  • Thesis

針對行動式圖形處理器的低耗能整合暫存器檔案之設計及其管理機制

A Unified Register File with Energy-Efficient Management Scheme for Mobile GPUs

Advisor: 朱守禮

Abstract


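As technology advances, demand for 3D applications on mobile devices keeps growing, and graphics processors are becoming increasingly common on these devices to provide real-time rendering. Because 3D rendering workloads are very large and highly data-parallel, GPUs typically combine many shaders with heavy hardware multithreading to raise throughput and deliver the smooth frame rates the human eye expects. As a result, the multiple sets of register files they require increase energy consumption, which shortens battery life on battery-powered mobile devices. According to prior studies, shaders account for 80% of a GPU's power consumption, and these register files account for roughly 10% to 20% of that, which shows the importance of managing their energy. In practice, typical 3D applications use very few registers, and the large number of unused registers wastes energy. To reduce this waste in the register files of multithreaded shaders, this work proposes a unified register file design equipped with different voltage modes. In addition, the Adaptive Scheduling mechanism proposed in this work balances performance against energy consumption and allocates an appropriate number of registers for multithreaded execution. Experimental results on various 3D games show that the unified register file saves 53.2% of energy with a performance loss below 0.001%. Combined with Adaptive Scheduling, it saves up to 86.6% of energy, whereas the design from related work saves only 11.2%; the proposed design therefore reduces the energy consumption of these register files very effectively.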

Parallel Abstract


With the recent rise of 3D applications on mobile devices, GPUs have become widely available on those devices. Because these workloads are highly data-parallel and involve a tremendous amount of data, GPUs are usually multithreaded to provide the high throughput needed for real-time rendering. This increases energy consumption because of the duplicated register files for shaders, which account for 10% to 20% of the energy in a GPU. However, the register usage of shading programs is quite low, which leaves many registers unused and thus wastes energy. To reduce the energy consumption of the register file, this work proposes a unified register file with multiple power modes. Furthermore, an adaptive scheduling mechanism is provided to trade off energy consumption against frames per second (FPS). The experimental results show that the unified register file saves 53.2% of energy with less than 0.001% performance overhead. In addition, adaptive scheduling with automatic FPS saves 86.6% of energy under real-world 3D games, while related work saves only 11.2%.
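
To put the reported percentages in perspective, the following is a rough back-of-the-envelope sketch, not a calculation taken from the thesis. It assumes that each quoted saving is relative to register-file energy and that the register file accounts for 10% to 20% of GPU energy, as the abstract states. The function and variable names are hypothetical.

```python
# Rough sketch: translate register-file energy savings into GPU-level savings.
# Assumption (not from the thesis): each saving is a fraction of register-file
# energy, and the register file's share of GPU energy is 10% to 20%.

RF_SHARE_RANGE = (0.10, 0.20)  # register-file share of GPU energy (low, high)

# Savings quoted in the abstract, as fractions of register-file energy.
SAVINGS = {
    "unified register file": 0.532,
    "unified register file + adaptive scheduling": 0.866,
    "related work": 0.112,
}


def gpu_level_saving(rf_share: float, rf_saving: float) -> float:
    """Fraction of total GPU energy saved for a given share and saving."""
    return rf_share * rf_saving


def summarize() -> dict:
    """Map each scheme to its (low, high) GPU-level saving."""
    return {
        name: tuple(gpu_level_saving(share, saving) for share in RF_SHARE_RANGE)
        for name, saving in SAVINGS.items()
    }


if __name__ == "__main__":
    for name, (low, high) in summarize().items():
        print(f"{name}: {low:.1%} to {high:.1%} of GPU energy")
```

Under these assumptions, a large saving at the register-file level corresponds to a more modest saving for the GPU as a whole, since the register file is only one of several energy consumers.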

Keywords

GPU register

