Title: Sharp asymptotics on the compression of two-layer neural networks

Abstract: In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M < N nodes. More precisely, we consider the setting in which the weights of the target network are i.i.d. sub-Gaussian, and we minimize the population L2 loss between the outputs of the target and of the compressed network, under the assumption of Gaussian inputs. By using tools from high-dimensional probability, we show that this non-convex problem can be simplified when the target network is sufficiently over-parameterized, and provide the error rate of this approximation as a function of the input dimension and N. For a ReLU activation function, we conjecture that the optimum of the simplified optimization problem is achieved by taking weights on the Equiangular Tight Frame (ETF), while the scaling of the weights and the orientation of the ETF depend on the parameters of the target network. Numerical evidence is provided to support this conjecture.
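
The setting in the abstract can be illustrated with a small numerical sketch. It is not the authors' code, and several choices are assumptions made only for illustration: the simplex ETF is used as one concrete equiangular tight frame, the networks are written as a weighted sum of ReLU units with hypothetical coefficient vectors, and the population L2 loss is approximated by Monte Carlo sampling of standard Gaussian inputs. The function names `simplex_etf`, `relu_net`, and `mc_l2_loss` are hypothetical.

```python
import numpy as np


def simplex_etf(m, d):
    """Return m unit vectors in R^d (as rows) with pairwise inner product -1/(m-1).

    Assumption: the simplex ETF stands in for the ETF mentioned in the abstract.
    """
    if m < 2 or d < m:
        raise ValueError("this construction needs 2 <= m <= d")
    # Centered and rescaled identity: rows have unit norm and equal angles.
    core = np.sqrt(m / (m - 1)) * (np.eye(m) - np.ones((m, m)) / m)
    frame = np.zeros((m, d))
    frame[:, :m] = core  # embed into R^d by zero padding
    return frame


def relu_net(x, weights, coeffs):
    """Two-layer ReLU network: sum_i coeffs[i] * relu(<weights[i], x>)."""
    return np.maximum(x @ weights.T, 0.0) @ coeffs


def mc_l2_loss(target_w, target_a, comp_w, comp_b, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[(f(x) - g(x))^2] for x ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, target_w.shape[1]))
    diff = relu_net(x, target_w, target_a) - relu_net(x, comp_w, comp_b)
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    d, n_nodes, m_nodes = 16, 64, 4  # illustrative sizes with M < N
    rng = np.random.default_rng(1)
    # Target network with i.i.d. Gaussian weights (a sub-Gaussian example).
    target_w = rng.standard_normal((n_nodes, d)) / np.sqrt(d)
    target_a = np.full(n_nodes, 1.0 / n_nodes)
    # Compressed network with weights on the simplex ETF; scaling left at 1.
    comp_w = simplex_etf(m_nodes, d)
    comp_b = np.full(m_nodes, 1.0 / m_nodes)
    print("estimated L2 loss:", mc_l2_loss(target_w, target_a, comp_w, comp_b))
```

In this sketch the scaling and orientation of the frame are fixed arbitrarily; the abstract states that both depend on the parameters of the target network, so a faithful experiment would optimize over them.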
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2205.08199 [cs.IT]
  (or arXiv:2205.08199v2 [cs.IT] for this version)

Submission history

From: Mohammad Hossein Amani
[v1] Tue, 17 May 2022 09:45:23 GMT (1389kb)
[v2] Wed, 18 May 2022 08:57:56 GMT (816kb)