Mathematics > Optimization and Control
Title: On the Influence of Momentum Acceleration on Online Learning
(Submitted on 14 Mar 2016 (v1), last revised 12 Oct 2016 (this version, v4))
Abstract: The article examines in some detail the convergence rate and mean-square-error performance of momentum stochastic gradient methods in the constant step-size and slow adaptation regime. The results establish that momentum methods are equivalent to the standard stochastic gradient method with a re-scaled (larger) step-size value. The size of the re-scaling is determined by the value of the momentum parameter. The equivalence result is established for all time instants and not only in steady-state. The analysis is carried out for general strongly convex and smooth risk functions, and is not limited to quadratic risks. One notable conclusion is that the well-known benefits of momentum constructions for deterministic optimization problems do not necessarily carry over to the adaptive online setting when small constant step-sizes are used to enable continuous adaptation and learning in the presence of persistent gradient noise. From simulations, the equivalence between momentum and standard stochastic gradient methods is also observed for non-differentiable and non-convex problems.
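The equivalence claim in the abstract can be illustrated with a small numerical sketch. Everything specific below is an assumption made for illustration, not something stated in the abstract: the heavy-ball form of the momentum recursion, the re-scaling factor 1/(1 - beta), the scalar least-squares risk, and all names and parameter values (`simulate`, `mu_m`, `beta`, and so on). The sketch runs momentum SGD and plain SGD on the same data stream and records how far apart the trajectories drift.

```python
import random


def grad(w, u, d):
    """Instantaneous gradient of the scalar risk 0.5 * (d - u * w)**2."""
    return -u * (d - u * w)


def simulate(mu_m=0.001, beta=0.5, n_iter=5000, w_true=1.0,
             noise_std=0.1, seed=0):
    """Run heavy-ball momentum SGD and plain SGD on one shared data stream.

    Returns (largest gap between momentum and re-scaled SGD,
             largest gap between momentum and un-rescaled SGD,
             final momentum iterate).
    """
    rng = random.Random(seed)
    mu = mu_m / (1.0 - beta)   # ASSUMED re-scaled step-size for plain SGD
    w_prev = w_mom = 0.0       # momentum iterates w_{i-2} and w_{i-1}
    w_sgd = 0.0                # plain SGD with the re-scaled step mu
    w_plain = 0.0              # plain SGD with the original step mu_m
    gap_rescaled = gap_plain = 0.0
    for _ in range(n_iter):
        # Streaming data: d = u * w_true + measurement noise.
        u = rng.gauss(0.0, 1.0)
        d = u * w_true + rng.gauss(0.0, noise_std)
        # Heavy-ball update: gradient step plus beta times the last move.
        w_next = w_mom - mu_m * grad(w_mom, u, d) + beta * (w_mom - w_prev)
        w_prev, w_mom = w_mom, w_next
        # Plain stochastic-gradient updates on the same sample.
        w_sgd -= mu * grad(w_sgd, u, d)
        w_plain -= mu_m * grad(w_plain, u, d)
        gap_rescaled = max(gap_rescaled, abs(w_mom - w_sgd))
        gap_plain = max(gap_plain, abs(w_mom - w_plain))
    return gap_rescaled, gap_plain, w_mom


if __name__ == "__main__":
    g_rescaled, g_plain, w_final = simulate()
    print("max gap vs re-scaled SGD:  ", g_rescaled)
    print("max gap vs un-rescaled SGD:", g_plain)
    print("final momentum iterate:    ", w_final)
```

Under these assumptions, the momentum trajectory should stay close to the re-scaled SGD trajectory throughout the run, transient included, while plain SGD with the original step-size lags noticeably behind. A single scalar run of this kind is only a sanity check, not a substitute for the analysis in the article.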
Submission history
From: Kun Yuan
[v1] Mon, 14 Mar 2016 05:05:54 GMT (2598kb)
[v2] Tue, 29 Mar 2016 06:27:47 GMT (2598kb)
[v3] Mon, 1 Aug 2016 23:18:27 GMT (318kb)
[v4] Wed, 12 Oct 2016 05:19:07 GMT (551kb)