Title: Sparsity in long-time control of neural ODEs

Abstract: We consider the neural ODE and optimal control perspective of supervised learning, with $\ell^1$-control penalties, where rather than only minimizing a final cost (the \emph{empirical risk}) for the state, we integrate this cost over the entire time horizon. We prove that any optimal control (for this cost) vanishes beyond some positive stopping time. When seen in the discrete-time context, this result entails an \emph{ordered} sparsity pattern for the parameters of the associated residual neural network: ordered in the sense that these parameters are all $0$ beyond a certain layer. Furthermore, we provide a polynomial stability estimate for the empirical risk with respect to the time horizon. This can be seen as a \emph{turnpike property}, for nonsmooth dynamics and functionals with $\ell^1$-penalties, and without any smallness assumptions on the data, both of which are new in the literature.
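
To illustrate the ordered sparsity pattern in the discrete-time setting, here is a minimal sketch; it is not the authors' code. It assumes a toy scalar residual network of the form x_{k+1} = x_k + dt * (w_k * tanh(x_k) + b_k), chosen only because zero parameters then make a layer act as the identity. The names `residual_forward` and `stopping_layer` and the example parameter values are hypothetical.

```python
import math


def residual_forward(x, layers, dt=1.0):
    """Apply a toy scalar residual network.

    Each layer is a pair (w, b) and updates the state by
    x <- x + dt * (w * tanh(x) + b).  This dynamics is an assumption
    made for illustration; with w = b = 0 the layer is the identity.
    """
    for w, b in layers:
        x = x + dt * (w * math.tanh(x) + b)
    return x


def stopping_layer(layers, tol=0.0):
    """Return the smallest index k such that every parameter in
    layers[k:] has absolute value <= tol.

    If k < len(layers), the parameters vanish from layer k onward,
    which is the discrete analogue of a control that vanishes
    beyond a stopping time.
    """
    k = len(layers)
    while k > 0 and all(abs(p) <= tol for p in layers[k - 1]):
        k -= 1
    return k


# Hypothetical parameters: the last two layers are identically zero.
layers = [(0.5, -0.1), (0.0, 0.2), (0.0, 0.0), (0.0, 0.0)]

k_star = stopping_layer(layers)

# Layers from k_star onward leave the state unchanged, so the
# truncated network produces the same output as the full one.
full_output = residual_forward(1.0, layers)
truncated_output = residual_forward(1.0, layers[:k_star])
```

In this sketch, layers at or beyond `k_star` can be dropped without changing the output, which is the practical consequence of an ordered (rather than scattered) sparsity pattern. Note that the second layer has a zero weight but a nonzero bias, so it still counts as active.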
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as: arXiv:2102.13566 [cs.LG]
  (or arXiv:2102.13566v3 [cs.LG] for this version)

Submission history

From: Borjan Geshkovski
[v1] Fri, 26 Feb 2021 16:23:02 GMT (2715kb,D)
[v2] Tue, 19 Oct 2021 10:32:57 GMT (3887kb,D)
[v3] Thu, 8 Sep 2022 16:02:48 GMT (20279kb,D)
