
Adaptive Runtime Exploiting Sparsity in Tensors of Deep Learning on Heterogeneous Systems

Advisor: 徐慰中

Abstract


Deep learning is applied ever more widely, but as neural networks grow deeper, they demand more and more computation and memory. Reducing computation and memory usage is therefore an important issue. This thesis investigates how, when a neural network exhibits high sparsity, we can exploit that sparsity on a heterogeneous system architecture to reduce both the computation and the memory capacity required.
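To make the memory and computation savings concrete, here is a minimal illustrative sketch (not code from the thesis): a CSR-style sparse matrix-vector product that stores and multiplies only the nonzero entries, so both storage and multiply-add count shrink in proportion to the sparsity.

```python
# Illustrative sketch: CSR (compressed sparse row) storage and a
# sparse matrix-vector product that skips zeros entirely.

def to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A @ x using only the stored nonzeros."""
    y = []
    for r in range(len(row_ptr) - 1):
        s = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            s += values[k] * x[col_idx[k]]
        y.append(s)
    return y

A = [[0, 0, 3],
     [0, 5, 0],
     [0, 0, 0]]
vals, cols, ptr = to_csr(A)
print(csr_matvec(vals, cols, ptr, [1, 2, 3]))  # [9, 10, 0]
```

For this 3x3 matrix only 2 of 9 values are stored and only 2 multiply-adds are performed; at the sparsity levels typical of ReLU activations, the same idea scales to large savings.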

Keywords

deep learning, sparsity, heterogeneous architecture

English Abstract


Heterogeneous computing achieves high performance by mapping the high parallelism and special types of computation (such as SIMD operations) found in applications onto the best-fit compute devices. For example, massive and regular SIMD operations can be computed more efficiently on a GPU. However, the performance of a heterogeneous program can degrade when the portion assigned to the GPU encounters irregular tasks. Deep learning is an application that exhibits high parallelism but may also encounter such irregular tasks. This study introduces a method that reduces computation and improves the performance of deep learning applications by recording information at runtime. Using the collected information, we can adaptively adjust the workload when irregular tasks arise. In that case, our method splits the workload of the deep learning application into two parts: a dense workload and a sparse workload. The dense workload is deployed on the GPU, while the sparse part is sent to the CPU. In this way, the GPU achieves better computing efficiency, and the CPU handles the sparse part more competently than the GPU would.
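The dense/sparse split described above can be sketched as follows. This is a hedged illustration, not the thesis implementation: the 0.5 density threshold and the idea of partitioning at row granularity are assumptions chosen for clarity; a real runtime would tune the threshold from profiled information.

```python
# Hedged sketch of adaptive workload splitting: rows whose measured
# density is high go to the GPU batch, sparse rows go to the CPU batch.
# The threshold value is an illustrative assumption, not from the thesis.

def row_density(row):
    """Fraction of nonzero entries in one row."""
    return sum(1 for v in row if v != 0) / len(row)

def split_workload(matrix, threshold=0.5):
    """Partition row indices into a dense batch (intended for the GPU)
    and a sparse batch (intended for the CPU), based on density
    measured at runtime."""
    gpu_rows, cpu_rows = [], []
    for i, row in enumerate(matrix):
        (gpu_rows if row_density(row) >= threshold else cpu_rows).append(i)
    return gpu_rows, cpu_rows

M = [[1, 2, 3, 4],   # fully dense row  -> GPU batch
     [0, 0, 0, 7],   # mostly zero row  -> CPU batch
     [5, 0, 6, 8]]   # dense enough row -> GPU batch
print(split_workload(M))  # ([0, 2], [1])
```

Splitting this way keeps the GPU's work regular (dense rows batch into uniform SIMD-friendly kernels), while the irregular, branch-heavy sparse rows go to the CPU, which tolerates irregular memory access better.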

