
Multi-parameter-server modeling for distributed asynchronous SGD

Advisor: 周承復

Abstract


Deep neural networks have recently achieved great success in many fields and have attracted the attention of more and more researchers around the world. The sheer volume of training work puts pressure on the development of both software and hardware. Distributed training is a common way to speed these jobs up. In this thesis we address one of the problems that arise when scaling up the training environment, and we also explain the overall model and the tools behind it.
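
For context, the asynchronous stochastic gradient descent referred to in the title follows the standard stale-gradient update (a textbook formulation, not a result specific to this thesis):

    w_{t+1} = w_t - \eta \, \nabla f(w_{t - \tau_t})

where \eta is the learning rate and \tau_t \ge 0 is the staleness of the parameters a worker read before computing its gradient; synchronous SGD corresponds to \tau_t = 0.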

Parallel Abstract (English)


Deep Neural Networks (DNNs) have been very successful and have drawn more and more attention from researchers all over the world. The huge demand for training jobs is challenging the development of both software tools and hardware systems. Distributed training is a common approach to speeding these jobs up. In this paper, we propose a new method that addresses one of the problems in expanding the scale of a training environment, and we also explain the model and the tools behind it.
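
The abstract refers to asynchronous SGD with the model parameters spread across several parameter servers. The following Python snippet is a rough illustration of that general idea only; it is not the implementation proposed in the thesis, and the names (ParameterServer, worker), the toy linear model, and the choice of two servers and four workers are all made up for the sketch. Each worker pulls possibly stale parameters, computes a local gradient, and pushes it to the server shards without synchronizing with other workers.

import threading
import numpy as np

class ParameterServer:
    """Holds one shard of the parameters and applies updates in place."""
    def __init__(self, shard, lr=0.1):
        self.shard = shard
        self.lr = lr
        self.lock = threading.Lock()

    def pull(self):
        # return a snapshot of this shard (may already be stale by the time it is used)
        with self.lock:
            return self.shard.copy()

    def push(self, grad):
        # apply a worker's gradient to this shard as soon as it arrives
        with self.lock:
            self.shard -= self.lr * grad

def worker(servers, data, labels, steps=100):
    """Repeatedly pull (possibly stale) parameters, compute a local gradient
    on a toy linear least-squares model, and push it back, shard by shard."""
    for _ in range(steps):
        w = np.concatenate([ps.pull() for ps in servers])
        grad = 2 * data.T @ (data @ w - labels) / len(labels)
        for ps, g in zip(servers, np.array_split(grad, len(servers))):
            ps.push(g)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 8))
    true_w = rng.normal(size=8)
    y = X @ true_w

    # two parameter servers, each owning half of the weight vector
    servers = [ParameterServer(s) for s in np.array_split(np.zeros(8), 2)]

    threads = [threading.Thread(target=worker, args=(servers, X, y))
               for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    w_final = np.concatenate([ps.pull() for ps in servers])
    print("parameter error:", np.linalg.norm(w_final - true_w))

Sharding the parameters over several servers spreads the push/pull traffic, which is the bottleneck this kind of setup is usually meant to relieve; the locks here only keep each shard's in-place update consistent and do not synchronize the workers with each other.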
