Mathematics > Optimization and Control

Title: A Multi-Batch L-BFGS Method for Machine Learning

Abstract: The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. To improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This inherently gives the algorithm a stochastic flavor that can cause instability in L-BFGS, a popular batch method in machine learning. These difficulties arise because L-BFGS employs gradient differences to update the Hessian approximations; when these gradients are computed using different data points, the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm on a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.
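The instability described in the abstract can be illustrated with a small numerical sketch. This is not the paper's algorithm: the least-squares loss, the synthetic data, and the names `batch_gradient`, `batch_1`, `batch_2`, `y_mixed`, and `y_same` are all assumptions made for illustration. The sketch only shows that a gradient difference taken across two different batches contains a sampling-noise term unrelated to curvature, whereas a difference taken on one fixed batch equals that batch's Hessian times the step.

```python
import numpy as np

def batch_gradient(w, A, b, idx):
    """Gradient of the mean of 0.5 * (a_i . w - b_i)^2 over the rows in idx."""
    A_s, b_s = A[idx], b[idx]
    return A_s.T @ (A_s @ w - b_s) / len(idx)

# Synthetic data (assumption: any least-squares problem would do here).
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

w = np.zeros(d)
s = 1e-3 * rng.normal(size=d)      # small step, so the next iterate is w + s
batch_1 = np.arange(0, 100)        # batch used at the current iterate
batch_2 = np.arange(100, 200)      # different batch used at the next iterate

# Gradient difference across two different batches.
y_mixed = batch_gradient(w + s, A, b, batch_2) - batch_gradient(w, A, b, batch_1)

# Gradient difference on one fixed batch.
y_same = batch_gradient(w + s, A, b, batch_1) - batch_gradient(w, A, b, batch_1)

# For this quadratic loss the batch Hessians are constant matrices.
H_1 = A[batch_1].T @ A[batch_1] / len(batch_1)
H_2 = A[batch_2].T @ A[batch_2] / len(batch_2)

# Sampling noise: the two batches disagree about the gradient at the same point.
noise = batch_gradient(w, A, b, batch_2) - batch_gradient(w, A, b, batch_1)

print("s . y_same  =", s @ y_same)     # equals s^T H_1 s, which is nonnegative
print("s . y_mixed =", s @ y_mixed)    # sign is not guaranteed
print("|curvature term| =", np.linalg.norm(H_2 @ s))
print("|noise term|     =", np.linalg.norm(noise))
```

In this setup `y_same` equals `H_1 @ s` exactly, while `y_mixed` equals `H_2 @ s + noise`. Because the noise term does not shrink with the step, it can dominate the curvature term for small steps, and the product `s @ y_mixed` is then no longer guaranteed to be positive, which is the condition a BFGS-type update relies on.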
Comments: 32 pages, 22 figures
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1605.06049 [math.OC]
  (or arXiv:1605.06049v1 [math.OC] for this version)

Submission history

From: Albert Berahas
[v1] Thu, 19 May 2016 16:53:50 GMT (1382kb,D)
[v2] Sun, 23 Oct 2016 22:48:01 GMT (1390kb,D)
