Title: Continuous vs. Discrete Optimization of Deep Neural Networks

Abstract: Existing analyses of optimization in deep learning are either continuous, focusing on (variants of) gradient flow, or discrete, directly treating (variants of) gradient descent. Gradient flow is amenable to theoretical analysis, but is stylized and disregards computational efficiency. The extent to which it represents gradient descent is an open question in the theory of deep learning. The current paper studies this question. Viewing gradient descent as an approximate numerical solution to the initial value problem of gradient flow, we find that the degree of approximation depends on the curvature around the gradient flow trajectory. We then show that over deep neural networks with homogeneous activations, gradient flow trajectories enjoy favorable curvature, suggesting they are well approximated by gradient descent. This finding allows us to translate an analysis of gradient flow over deep linear neural networks into a guarantee that gradient descent efficiently converges to a global minimum almost surely under random initialization. Experiments suggest that over simple deep neural networks, gradient descent with a conventional step size is indeed close to gradient flow. We hypothesize that the theory of gradient flows will unravel mysteries behind deep learning.
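
To make the abstract's viewpoint concrete, the following is a minimal sketch (not taken from the paper) that treats gradient descent as the explicit Euler discretization of gradient flow. It assumes a toy one-dimensional objective f(theta) = 0.5 * curvature * theta**2, for which gradient flow has the closed form theta(t) = theta(0) * exp(-curvature * t). The function names, the objective, and the step sizes are illustrative assumptions only; the sketch does not reproduce the paper's analysis of deep networks.

```python
import math


def gd_trajectory(theta0, curvature, step_size, num_steps):
    """Gradient descent on f(theta) = 0.5 * curvature * theta**2.

    Each update is one explicit Euler step on the gradient flow ODE
    d(theta)/dt = -f'(theta) = -curvature * theta.
    """
    thetas = [theta0]
    for _ in range(num_steps):
        grad = curvature * thetas[-1]
        thetas.append(thetas[-1] - step_size * grad)
    return thetas


def flow_at(theta0, curvature, t):
    """Closed-form gradient flow solution for the toy quadratic."""
    return theta0 * math.exp(-curvature * t)


def max_gap(theta0, curvature, step_size, total_time):
    """Largest distance between GD iterate k and the flow at time k * step_size."""
    num_steps = round(total_time / step_size)
    thetas = gd_trajectory(theta0, curvature, step_size, num_steps)
    return max(
        abs(theta - flow_at(theta0, curvature, k * step_size))
        for k, theta in enumerate(thetas)
    )


# Compare the discrete and continuous trajectories for several step sizes.
gaps = {eta: max_gap(1.0, 1.0, eta, 1.0) for eta in (0.1, 0.01, 0.001)}
for eta, gap in gaps.items():
    print(f"step size {eta}: max gap {gap:.6f}")
```

On this toy problem the gap between the two trajectories shrinks as the step size decreases, which is the sense in which gradient descent approximates gradient flow. How the gap behaves on deep networks is the subject of the paper itself.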
Comments: Published as a spotlight paper at the Conference on Neural Information Processing Systems (NeurIPS) 2021
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Cite as: arXiv:2107.06608 [cs.LG]
  (or arXiv:2107.06608v3 [cs.LG] for this version)

Submission history

From: Omer Elkabetz
[v1] Wed, 14 Jul 2021 10:59:57 GMT (594kb,D)
[v2] Wed, 1 Dec 2021 18:31:09 GMT (2124kb,D)
[v3] Tue, 28 Dec 2021 11:39:25 GMT (2123kb,D)