Tuning the Scheduling of Distributed Stochastic Gradient Descent with Bayesian Optimization

Dalibard, Valentin; Schaarschmidt, Michael; Yoneki, Eiko

(Submitted on 1 Dec 2016)

Abstract: We present an optimizer which uses Bayesian optimization to tune the system parameters of distributed stochastic gradient descent (SGD). Given a specific context, our goal is to quickly find efficient configurations which appropriately balance the load between the available machines to minimize the average SGD iteration time. Our experiments consider setups with over thirty parameters. Traditional Bayesian optimization, which uses a Gaussian process as its model, is not well suited to such high-dimensional domains. To reduce convergence time, we exploit the available structure. We design a probabilistic model which simulates the behavior of distributed SGD and use it within Bayesian optimization. Our model can exploit many runtime measurements for inference per evaluation of the objective function. Our experiments show that the resulting optimizer converges to efficient configurations within ten iterations, and that the optimized configurations outperform those found by a generic optimizer in thirty iterations by up to 2x.
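
To make the idea in the abstract concrete, here is a minimal sketch of model-based tuning for load balancing. It is not the authors' model or code. It assumes a toy setting in which each machine has an unknown per-example cost, a synchronous SGD iteration takes as long as its slowest machine, and every evaluation returns one noisy timing per machine. The names `split_batch`, `iteration_time`, and `tune`, the normal prior, the noise level, and the use of Thompson sampling to pick the next configuration are all illustrative assumptions.

```python
import random


def split_batch(costs, batch_size):
    """Integer loads roughly proportional to 1/cost, summing to batch_size."""
    weights = [1.0 / c for c in costs]
    total = sum(weights)
    loads = [int(batch_size * w / total) for w in weights]
    # Give the examples lost to rounding down to the fastest machine.
    loads[weights.index(max(weights))] += batch_size - sum(loads)
    return loads


def iteration_time(loads, costs):
    """Synchronous step: every machine waits for the slowest one."""
    return max(n * c for n, c in zip(loads, costs))


def tune(true_costs, batch_size, iterations=10, noise_sd=5.0, seed=0):
    """Toy structured Bayesian optimization loop (hypothetical setup)."""
    rng = random.Random(seed)
    k = len(true_costs)
    # Assumed normal prior on each machine's per-example cost.
    mean = [2.0] * k
    precision = [1.0] * k  # precision = 1 / variance
    for _ in range(iterations):
        # Thompson sampling: draw one plausible cost vector from the
        # posterior and pick the split that would be best if it were true.
        sampled = [max(1e-3, rng.gauss(m, p ** -0.5))
                   for m, p in zip(mean, precision)]
        loads = split_batch(sampled, batch_size)
        # One evaluation yields a separate timing for every loaded machine.
        for i, n in enumerate(loads):
            if n == 0:
                continue  # no work assigned, so nothing was measured
            measured = n * true_costs[i] + rng.gauss(0.0, noise_sd)
            # Conjugate normal update: measured / n estimates the cost
            # with standard deviation noise_sd / n.
            obs_precision = (n / noise_sd) ** 2
            mean[i] = ((precision[i] * mean[i] + obs_precision * measured / n)
                       / (precision[i] + obs_precision))
            precision[i] += obs_precision
    # Final configuration: split according to the posterior mean costs.
    return split_batch([max(1e-3, m) for m in mean], batch_size)


true_costs = [1.0, 2.0, 4.0]   # simulated ground truth, unknown to the tuner
uniform = [234, 233, 233]      # naive equal split of 700 examples
tuned = tune(true_costs, batch_size=700)
print("uniform:", uniform, iteration_time(uniform, true_costs))
print("tuned:  ", tuned, iteration_time(tuned, true_costs))
```

Because each evaluation updates one posterior per machine, the model learns from several measurements per run rather than from a single scalar objective value, which is the property the abstract credits for fast convergence. A generic Gaussian process over the full configuration space would see only the overall iteration time.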

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1612.00383 [stat.ML]
	(or arXiv:1612.00383v1 [stat.ML] for this version)

Submission history

From: Valentin Dalibard
[v1] Thu, 1 Dec 2016 19:08:12 GMT (36kb,D)
