Computer Science > Machine Learning
Title: Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
(Submitted on 26 Jun 2020 (v1), last revised 18 Mar 2021 (this version, v3))
Abstract: We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with polynomial complexity with respect to the number of data samples, the number of neurons, and the data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an $\ell_2$ norm regularized convex program. We then show that multi-layer circular CNN training problems with a single ReLU layer are equivalent to an $\ell_1$ regularized convex program that encourages sparsity in the spectral domain. We also extend these results to three-layer CNNs with two ReLU layers. Furthermore, we present extensions of our approach to different pooling methods, which elucidate the implicit architectural bias of each method as a convex regularizer.
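The abstract's reference to sparsity "in the spectral domain" for circular CNNs rests on a standard fact: circular convolution becomes an elementwise product under the discrete Fourier transform. The sketch below is not the paper's convex program; it only illustrates that identity. The signal `x`, the filter `h`, and all function names are hypothetical and chosen purely for illustration.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; the inverse applies the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def circular_conv_direct(x, h):
    """Circular convolution from the definition: y[i] = sum_j x[(i - j) mod n] * h[j]."""
    n = len(x)
    return [sum(x[(i - j) % n] * h[j] for j in range(n)) for i in range(n)]

def circular_conv_spectral(x, h):
    """Same operation computed as an elementwise product of the two spectra."""
    X, H = dft(x), dft(h)
    y = dft([a * b for a, b in zip(X, H)], inverse=True)
    return [v.real for v in y]

# Hypothetical toy signal and filter of equal length (illustrative values only).
x = [1.0, -2.0, 0.5, 3.0]
h = [0.25, 0.0, -1.0, 2.0]

direct = circular_conv_direct(x, h)
spectral = circular_conv_spectral(x, h)

# The two results agree up to floating-point error.
max_abs_diff = max(abs(a - b) for a, b in zip(direct, spectral))
```

Because the filter acts coordinate-wise on the spectrum, a penalty on the filter can be read as a penalty on its Fourier coefficients, which is one way to interpret the spectral-domain $\ell_1$ regularizer mentioned in the abstract.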
Submission history
From: Tolga Ergen
[v1] Fri, 26 Jun 2020 04:47:20 GMT (735kb)
[v2] Mon, 5 Oct 2020 15:17:31 GMT (748kb)
[v3] Thu, 18 Mar 2021 15:30:26 GMT (2117kb,D)