
Adaptive Communication Schemes for Stochastic Gradient Descent Algorithms in Distributed Systems

Adaptive Communication for Stochastic Gradient Descent in Distributed Deep Learning Systems

Advisor: 劉邦鋒

Abstract


Distributed deep learning plays an important role in the development of artificial intelligence systems. Many distributed learning algorithms have been proposed to speed up the training process; in these algorithms, every worker must frequently exchange gradients to accelerate convergence. However, exchanging gradients at a fixed period can lead to inefficient data transmission. In this thesis, we propose an efficient communication method to improve the performance of the stochastic gradient descent algorithm. We decide when to communicate according to the change of the model: when the model changes significantly, we send it to the other workers to compute a new averaged result. In addition, we dynamically set a threshold to control the communication period. With this efficient communication method, we reduce the amount of data transferred and thus improve performance.

Parallel Abstract (English)


Distributed deep learning plays an important role in developing artificial intelligence systems. Many studies have proposed distributed learning algorithms to speed up the training process. In these algorithms, the workers have to exchange gradients frequently for fast convergence. However, exchanging gradients at a fixed period can cause inefficient data transmission. In this thesis, we propose an efficient communication method to improve the performance of the gossiping stochastic gradient descent (SGD) algorithm. We decide the timing of communication according to the change of the local model: when the local model changes significantly, we push the model to other workers to calculate a new averaged result. In addition, we dynamically adjust a threshold to control the communication period. With this efficient communication method, we reduce communication overhead and thus improve the performance.
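
To make the communication rule concrete, the following is a minimal sketch of a threshold-triggered push with a dynamically adjusted threshold, not the thesis's actual implementation. The class name AdaptiveCommunicator, the L2-norm measure of model change, the multiplicative threshold adjustment toward a target period, and the push_fn callback are all illustrative assumptions.

# Minimal, illustrative sketch of adaptive communication for gossiping SGD.
# Assumptions (not taken from the thesis): model change is measured as the L2
# norm of (current model - model at the last push), and the threshold is
# adjusted multiplicatively so pushes happen roughly every target_period steps.
import numpy as np

class AdaptiveCommunicator:
    def __init__(self, model, init_threshold=0.1, target_period=10, adapt_rate=1.1):
        self.last_pushed = model.copy()     # model snapshot at the last communication
        self.threshold = init_threshold     # current trigger threshold
        self.target_period = target_period  # desired number of steps between pushes
        self.adapt_rate = adapt_rate        # multiplicative threshold adjustment factor
        self.steps_since_push = 0

    def step(self, model, push_fn):
        """Call once per SGD step; push the model to peers when it has changed enough."""
        self.steps_since_push += 1
        change = np.linalg.norm(model - self.last_pushed)
        if change >= self.threshold:
            push_fn(model)  # e.g. gossip the model to other workers for averaging
            # Adapt the threshold so communication roughly matches target_period.
            if self.steps_since_push < self.target_period:
                self.threshold *= self.adapt_rate   # communicating too often: raise it
            else:
                self.threshold /= self.adapt_rate   # communicating too rarely: lower it
            self.last_pushed = model.copy()
            self.steps_since_push = 0

# Toy usage on a single worker: noisy gradient steps on f(x) = 0.5 * ||x||^2.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = rng.normal(size=100)
    comm = AdaptiveCommunicator(model)
    for _ in range(200):
        grad = model + 0.1 * rng.normal(size=100)    # stochastic gradient
        model = model - 0.01 * grad
        comm.step(model, push_fn=lambda m: None)      # push_fn would gossip m to peers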

