Asynchrony begets Momentum, with an Application to Deep Learning

Mitliagkas, Ioannis; Zhang, Ce; Hadjis, Stefan; Ré, Christopher

(Submitted on 31 May 2016 (this version), latest version 25 Nov 2016 (v2))

Abstract: Asynchronous methods are widely used in deep learning, but have limited theoretical justification when applied to non-convex problems. We give a simple argument that running stochastic gradient descent (SGD) in an asynchronous manner can be viewed as adding a momentum-like term to the SGD iteration. Our result does not assume convexity of the objective function, so it is applicable to deep learning systems. We observe that a standard queuing model of asynchrony results in a form of momentum that is commonly used by deep learning practitioners. This forges a link between queuing theory and asynchrony in deep learning systems, which could be useful for systems builders. For convolutional neural networks, we experimentally validate that the degree of asynchrony directly correlates with the momentum, confirming our main result. Since asynchrony has better hardware efficiency, this result may shed light on when asynchronous execution is more efficient for deep learning systems.
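
The abstract's central claim can be illustrated with a minimal sketch. This is not the authors' code or experimental setup. It assumes a toy 1-D quadratic objective and a staleness that is geometrically distributed with parameter `mu`, drawn independently at every step. These modeling choices and all names (`stale_sgd_mean`, `momentum_gd`, `sample_staleness`, `mu`, `alpha`) are illustrative assumptions, not details taken from the paper. Because the toy gradient is linear, the average of many stale-gradient SGD runs should track a momentum iteration with momentum `mu` and step size `alpha * (1 - mu)`.

```python
import random


def grad(w, a=1.0):
    """Gradient of the toy objective f(w) = 0.5 * a * w**2 (an assumption)."""
    return a * w


def sample_staleness(rng, mu):
    """Draw k with P(k) = (1 - mu) * mu**k, a hypothetical staleness model."""
    k = 0
    while rng.random() < mu:
        k += 1
    return k


def stale_sgd_mean(w0, alpha, mu, steps, trials, seed=0):
    """Average trajectory of SGD whose gradient is read at a stale iterate."""
    rng = random.Random(seed)
    mean = [0.0] * (steps + 1)
    for _ in range(trials):
        w = [w0]
        for t in range(steps):
            k = sample_staleness(rng, mu)
            stale = w[max(t - k, 0)]  # iterates before step 0 are taken to be w0
            w.append(w[t] - alpha * grad(stale))
        for t, wt in enumerate(w):
            mean[t] += wt / trials
    return mean


def momentum_gd(w0, alpha, mu, steps):
    """Deterministic momentum iteration with step size alpha * (1 - mu)."""
    w = [w0]
    v = -alpha * grad(w0)  # initial velocity consistent with the clamped history
    for t in range(steps):
        if t > 0:
            v = mu * v - alpha * (1.0 - mu) * grad(w[t])
        w.append(w[t] + v)
    return w


if __name__ == "__main__":
    avg = stale_sgd_mean(w0=1.0, alpha=0.1, mu=0.75, steps=40, trials=5000)
    ref = momentum_gd(w0=1.0, alpha=0.1, mu=0.75, steps=40)
    gap = max(abs(x - y) for x, y in zip(avg, ref))
    print("largest gap between averaged stale SGD and momentum:", gap)
```

The match is exact in expectation only because the toy gradient is linear and the staleness is independent of the iterates. For a non-convex network loss, the sketch shows only the form of the relationship, not the paper's result.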

Comments:	7 pages, 5 figures
Subjects:	Machine Learning (stat.ML); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:1605.09774 [stat.ML]
	(or arXiv:1605.09774v1 [stat.ML] for this version)

Submission history

From: Ioannis Mitliagkas
[v1] Tue, 31 May 2016 19:16:56 GMT (586kb,D)
[v2] Fri, 25 Nov 2016 12:00:28 GMT (1696kb,D)
