
Adaptive Communication Schemes for Stochastic Gradient Descent Algorithms in Distributed Systems

Adaptive Communication for Stochastic Gradient Descent in Distributed Deep Learning Systems

Advisor: 劉邦鋒

Abstract


Distributed deep learning plays an important role in the development of artificial intelligence systems. Many distributed learning algorithms have been proposed to speed up the training process; in these algorithms, every worker must frequently exchange gradients to accelerate convergence. However, exchanging gradients at a fixed period can lead to inefficient data transmission. In this thesis, we propose an efficient communication method to improve the performance of the stochastic gradient descent algorithm. We decide when to communicate according to the change of the model: when the model changes significantly, we send it to the other workers to compute a new averaged result. In addition, we dynamically set a threshold to control the communication period. With this efficient communication method, we reduce the amount of data transferred and thus improve performance.

Parallel Abstract (English)


Distributed deep learning plays an important role in developing artificial intelligence systems. Many studies have proposed distributed learning algorithms to speed up the training process. In these algorithms, the workers have to exchange gradients frequently for fast convergence. However, exchanging gradients at a fixed period can cause inefficient data transmission. In this thesis, we propose an efficient communication method to improve the performance of the gossiping stochastic gradient descent (SGD) algorithm. We decide the timing of communication according to the change of the local model: when the local model changes significantly, we push the model to other workers to calculate a new averaged result. In addition, we dynamically adjust a threshold to control the communication period. With this efficient communication method, we reduce communication overhead and thus improve the performance.
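
To make the communication rule concrete, the following is a minimal sketch of a threshold-triggered push with a dynamically adjusted threshold, not the thesis's actual implementation. The class name AdaptiveCommunicator, the L2-norm measure of model change, the multiplicative threshold adjustment toward a target period, and the push_fn callback are all illustrative assumptions.

# Minimal, illustrative sketch of adaptive communication for gossiping SGD.
# Assumptions (not taken from the thesis): model change is measured as the L2
# norm of (current model - model at the last push), and the threshold is
# adjusted multiplicatively so pushes happen roughly every target_period steps.
import numpy as np

class AdaptiveCommunicator:
    def __init__(self, model, init_threshold=0.1, target_period=10, adapt_rate=1.1):
        self.last_pushed = model.copy()     # model snapshot at the last communication
        self.threshold = init_threshold     # current trigger threshold
        self.target_period = target_period  # desired number of steps between pushes
        self.adapt_rate = adapt_rate        # multiplicative threshold adjustment factor
        self.steps_since_push = 0

    def step(self, model, push_fn):
        """Call once per SGD step; push the model to peers when it has changed enough."""
        self.steps_since_push += 1
        change = np.linalg.norm(model - self.last_pushed)
        if change >= self.threshold:
            push_fn(model)  # e.g. gossip the model to other workers for averaging
            # Adapt the threshold so communication roughly matches target_period.
            if self.steps_since_push < self.target_period:
                self.threshold *= self.adapt_rate   # communicating too often: raise it
            else:
                self.threshold /= self.adapt_rate   # communicating too rarely: lower it
            self.last_pushed = model.copy()
            self.steps_since_push = 0

# Toy usage on a single worker: noisy gradient steps on f(x) = 0.5 * ||x||^2.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = rng.normal(size=100)
    comm = AdaptiveCommunicator(model)
    for _ in range(200):
        grad = model + 0.1 * rng.normal(size=100)    # stochastic gradient
        model = model - 0.01 * grad
        comm.step(model, push_fn=lambda m: None)      # push_fn would gossip m to peers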

