Title: Towards Understanding the Condensation of Neural Networks at Initial Training

Abstract: Empirical works show that for ReLU neural networks (NNs) with small initialization, the input weights of hidden neurons condense on isolated orientations (the input weight of a hidden neuron consists of the weights from its input layer to the hidden neuron and its bias term). These condensation dynamics imply that training implicitly regularizes an NN towards one with a much smaller effective size. In this work, we illustrate the formation of condensation in multi-layer fully connected NNs and show that the maximal number of condensed orientations in the initial training stage is twice the multiplicity of the activation function, where "multiplicity" refers to the multiplicity of the root of the activation function at the origin. Our theoretical analysis confirms the experiments in two cases: one is an activation function of multiplicity one with input of arbitrary dimension, which covers many common activation functions; the other is a layer with one-dimensional input and arbitrary multiplicity. This work takes a step towards understanding how small initialization leads NNs to condensation at the initial training stage.
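
The bound stated in the abstract (at most twice the multiplicity) can be illustrated with a small numerical sketch. It assumes a reading that the abstract does not spell out: that an activation function has multiplicity p when p is the smallest positive integer for which its p-th derivative at the origin is nonzero. The helper names, step size, and tolerance below are hypothetical choices for illustration, not taken from the paper.

```python
import math
from typing import Callable


def derivative_at_zero(f: Callable[[float], float], order: int, h: float = 1e-2) -> float:
    """Central finite-difference estimate of the order-th derivative of f at 0."""
    total = 0.0
    for i in range(order + 1):
        sign = -1.0 if i % 2 else 1.0
        total += sign * math.comb(order, i) * f((order / 2 - i) * h)
    return total / h**order


def multiplicity(f: Callable[[float], float], max_order: int = 6, tol: float = 1e-3) -> int:
    """Smallest p >= 1 whose estimated p-th derivative at 0 exceeds tol.

    Assumed definition of multiplicity; tol is an illustrative threshold.
    """
    for p in range(1, max_order + 1):
        if abs(derivative_at_zero(f, p)) > tol:
            return p
    raise ValueError("no nonzero derivative found up to max_order")


def max_condensed_orientations(f: Callable[[float], float]) -> int:
    """Bound from the abstract: twice the multiplicity."""
    return 2 * multiplicity(f)


# Example activations (illustrative choices):
tanh_mult = multiplicity(math.tanh)                        # tanh(x) ~ x near 0
xtanh_mult = multiplicity(lambda x: x * math.tanh(x))      # x*tanh(x) ~ x^2 near 0
```

Under this assumed definition, tanh has multiplicity one, so the bound allows at most two condensed orientations, while x*tanh(x) has multiplicity two and the bound allows at most four. The finite-difference estimate is only suitable for smooth activations; it does not apply to ReLU, which is not differentiable at the origin.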
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
MSC classes: 68T07
Cite as: arXiv:2105.11686 [cs.LG]
  (or arXiv:2105.11686v6 [cs.LG] for this version)

Submission history

From: Hanxu Zhou
[v1] Tue, 25 May 2021 05:47:55 GMT (20395kb,D)
[v2] Sat, 29 May 2021 04:23:57 GMT (20278kb,D)
[v3] Tue, 10 Aug 2021 16:45:00 GMT (22580kb,D)
[v4] Wed, 17 Nov 2021 20:33:10 GMT (17563kb,D)
[v5] Sat, 21 May 2022 15:53:08 GMT (22893kb,D)
[v6] Wed, 19 Oct 2022 17:32:30 GMT (69489kb,D)
