We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

math.OC

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Mathematics > Optimization and Control

Title: Nys-Newton: Nyström-Approximated Curvature for Stochastic Optimization

Abstract: Second-order optimization methods are among the most widely used optimization approaches for convex optimization problems, and have recently been used to optimize non-convex optimization problems such as deep learning models. The widely used second-order optimization methods such as quasi-Newton methods generally provide curvature information by approximating the Hessian using the secant equation. However, the secant equation becomes insipid in approximating the Newton step owing to its use of the first-order derivatives. In this study, we propose an approximate Newton sketch-based stochastic optimization algorithm for large-scale empirical risk minimization. Specifically, we compute a partial column Hessian of size ($d\times m$) with $m\ll d$ randomly selected variables, then use the \emph{Nystr\"om method} to better approximate the full Hessian matrix. To further reduce the computational complexity per iteration, we directly compute the update step ($\Delta\boldsymbol{w}$) without computing and storing the full Hessian or its inverse. We then integrate our approximated Hessian with stochastic gradient descent and stochastic variance-reduced gradient methods. The results of numerical experiments on both convex and non-convex functions show that the proposed approach was able to obtain a better approximation of Newton\textquotesingle s method, exhibiting performance competitive with that of state-of-the-art first-order and stochastic quasi-Newton methods. Furthermore, we provide a theoretical convergence analysis for convex functions.
Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2110.08577 [math.OC]
  (or arXiv:2110.08577v2 [math.OC] for this version)

Submission history

From: Dinesh Singh [view email]
[v1] Sat, 16 Oct 2021 14:04:51 GMT (823kb,D)
[v2] Sat, 29 Jan 2022 21:25:27 GMT (1121kb,D)

Link back to: arXiv, form interface, contact.