Title: On the fast convergence of minibatch heavy ball momentum

Abstract: Simple stochastic momentum methods are widely used in machine learning optimization, but their good practical performance is at odds with an absence of theoretical guarantees of acceleration in the literature. In this work, we aim to close the gap between theory and practice by showing that stochastic heavy ball momentum, which can be interpreted as a randomized Kaczmarz algorithm with momentum, retains the fast linear rate of (deterministic) heavy ball momentum on quadratic optimization problems, at least when minibatching with a sufficiently large batch size is used. The analysis relies on carefully decomposing the momentum transition matrix, and using new spectral norm concentration bounds for products of independent random matrices. We provide numerical experiments to demonstrate that our bounds are reasonably sharp.
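As a rough illustration of the iteration the abstract describes, here is a minimal sketch of minibatch heavy ball momentum on a consistent least-squares problem, with rows sampled in proportion to their squared norms as in randomized Kaczmarz. The function name `minibatch_hbm`, the tiny 4x2 system, and the hand-picked step size, momentum, and batch size are all assumptions made for this example; they are not the parameter choices or the experiments from the paper.

```python
import random


def minibatch_hbm(A, b, alpha, beta, batch_size, iters, seed=0):
    """Sketch of minibatch heavy ball momentum for min_x 0.5 * ||A x - b||^2.

    Hypothetical helper for illustration only. A is a list of rows, b a list.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])

    # Kaczmarz-style sampling: row i is drawn with probability
    # proportional to its squared Euclidean norm.
    row_norms_sq = [sum(a * a for a in row) for row in A]
    total = sum(row_norms_sq)
    probs = [r / total for r in row_norms_sq]

    x_prev = [0.0] * d
    x = [0.0] * d
    for _ in range(iters):
        # Draw the minibatch i.i.d. with replacement.
        batch = rng.choices(range(n), weights=probs, k=batch_size)

        # Importance-weighted minibatch gradient: an unbiased estimate
        # of the full gradient A^T (A x - b).
        g = [0.0] * d
        for i in batch:
            resid = sum(A[i][j] * x[j] for j in range(d)) - b[i]
            w = resid / (probs[i] * batch_size)
            for j in range(d):
                g[j] += w * A[i][j]

        # Heavy ball update: gradient step plus momentum term.
        x_next = [x[j] - alpha * g[j] + beta * (x[j] - x_prev[j])
                  for j in range(d)]
        x_prev, x = x, x_next
    return x


# Small consistent system with solution x = (1, -2); values chosen by hand.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [1.0, -2.0, -1.0, 3.0]

# alpha = 1 / ||A||_F^2 = 1/6; with beta = 0 and batch_size = 1 this step
# size reduces the update to the classical randomized Kaczmarz projection.
x_hat = minibatch_hbm(A, b, alpha=1.0 / 6.0, beta=0.1, batch_size=2, iters=300)
print(x_hat)
```

Because the system is consistent, the stochastic gradient vanishes at the solution, which is the setting in which a linear rate is plausible. The sketch does not attempt to reproduce the batch-size condition or the rate established in the paper.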
Subjects: Machine Learning (cs.LG); Data Structures and Algorithms (cs.DS); Numerical Analysis (math.NA); Optimization and Control (math.OC); Machine Learning (stat.ML)
MSC classes: 65K05, 90C06, 90C30, 65F10, 68W20
Cite as: arXiv:2206.07553 [cs.LG]
  (or arXiv:2206.07553v2 [cs.LG] for this version)

Submission history

From: Tyler Chen
[v1] Wed, 15 Jun 2022 14:12:45 GMT (1074kb,D)
[v2] Thu, 28 Jul 2022 15:25:49 GMT (1083kb,D)