

IP-Based Accelerator with Adjustable Mapping Parallelism Dataflow for Task-Specific DNN

Advisor: 陳良基

Abstract


AI is widely used across domains. To handle the complexity of models with billions of parameters and rapidly evolving architectures, NN models and computational power need to be tightly integrated. While general-purpose accelerators use complex interconnection networks to adapt to model variations, task-specific accelerators offer a better solution. Through analysis, we found that NN model variations are gradual and predictable. We propose a new architecture that divides an NN model into subsets with similar computational characteristics. By mapping these subsets onto optimized sub-accelerators, we achieve a high degree of integration between computational power and the NN model. On ResNet-50, our architecture reduces PE usage by an average of 32% and energy cost by 24% compared with state-of-the-art accelerators. For image-segmentation models such as UNet, it reduces PE usage by 49% and energy cost by 39%.
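The partitioning idea in the abstract can be sketched in a few lines. This is a minimal, hypothetical illustration, not the thesis's actual method: the layer shapes are ResNet-50-like placeholders, and the grouping key and PE-sizing rule (`characteristic`, the cap of 64) are invented for demonstration only.

```python
# Hypothetical sketch: group layers with similar computational
# characteristics into subsets, then size one sub-accelerator per
# subset instead of provisioning for the whole model's worst case.
from collections import defaultdict

# (stage, output_channels, spatial_size) -- illustrative values only,
# loosely modeled on ResNet-50 stages, not the thesis's profiling data.
layers = [
    ("conv2_x", 256, 56),
    ("conv2_x", 256, 56),
    ("conv3_x", 512, 28),
    ("conv3_x", 512, 28),
    ("conv4_x", 1024, 14),
    ("conv5_x", 2048, 7),
]

def characteristic(layer):
    """Grouping key: here simply the channel/spatial profile."""
    _, channels, spatial = layer
    return (channels, spatial)

# Partition layers into subsets with identical characteristics.
subsets = defaultdict(list)
for layer in layers:
    subsets[characteristic(layer)].append(layer)

# Each subset maps to a sub-accelerator whose PE array is sized to
# that subset's parallelism needs (toy sizing rule: channels, capped).
for (channels, spatial), members in subsets.items():
    pe_needed = min(channels, 64)
    print(f"subset ({channels}, {spatial}): "
          f"{len(members)} layers -> {pe_needed} PEs")
```

Because variations within each subset are gradual, a sub-accelerator tuned to one subset serves all of its layers efficiently, which is where the claimed PE and energy savings would come from.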

