In this thesis, we propose an efficient method for optimizing convolutional neural network (CNN) dataflows. The method uses the "reuse distance" metric to quantify the data locality of different dataflows. The unique contribution of our quantitative approach is that it enables the systematic design of the optimal convolution dataflow for different memory architectures at the early system-design stage; moreover, it applies to both general-purpose and application-specific processors. We validate our method by comparing it against the best published results for specific processors: it takes less than one second to find the optimal dataflow and runs three to four orders of magnitude faster than prior work. To demonstrate its generality, the experimental results also show that our method can derive the optimal convolution dataflow for other memory architectures simulated with DineroIV.
In this paper, we propose an efficient CNN dataflow optimization approach that leverages the reuse-distance method to quantify the data locality of different dataflows. A unique contribution of our quantitative approach is that it enables the systematic design of the optimal convolution dataflow for different memory architectures in the early system design phase. Additionally, our method can be applied to both general-purpose and customized processors. We validate our approach by comparing it against the best published results for specific processors. Our approach takes less than one second to find the optimal dataflow and runs three to four orders of magnitude faster than previous works. To demonstrate its versatility, the experimental results also show that our approach can produce the optimal convolution dataflow on other memory architectures simulated by DineroIV.
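To make the core metric concrete, the following is a minimal illustrative sketch (not the paper's implementation) of the classic reuse-distance computation on a memory-access trace: the reuse distance of an access is the number of distinct addresses touched since the previous access to the same address, with first accesses assigned infinity. A dataflow whose trace yields smaller reuse distances exhibits better temporal locality for a given cache capacity.

```python
from collections import OrderedDict

def reuse_distances(trace):
    """Return the reuse distance of each access in `trace`.

    Reuse distance = number of distinct addresses accessed since the
    previous access to the same address; first accesses get infinity.
    """
    stack = OrderedDict()   # LRU stack: most recently used address is last
    distances = []
    for addr in trace:
        if addr in stack:
            # Count the distinct addresses above `addr` in the LRU stack.
            keys = list(stack)
            distances.append(len(keys) - 1 - keys.index(addr))
            del stack[addr]
        else:
            distances.append(float("inf"))
        stack[addr] = None  # (Re)insert at the most-recent position
    return distances

# Example: 'a' is reused after touching {b, c}, so its distance is 2.
print(reuse_distances(["a", "b", "c", "a", "b"]))
# → [inf, inf, inf, 2, 2]
```

In a dataflow-optimization setting, one would generate the address trace induced by each candidate loop ordering and tiling of the convolution, then compare the resulting reuse-distance histograms against the capacities of the memory hierarchy under study.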