Title: Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

Abstract: Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question. We propose a phenomenological model of NN training to explain this non-overfitting puzzle. Our linear frequency principle (LFP) model accounts for a key dynamical feature of NNs: they learn low frequencies first, irrespective of microscopic details. Theory based on our LFP model shows that low-frequency dominance of target functions is the key condition for the non-overfitting of NNs, a prediction verified by experiments. Furthermore, through an ideal two-layer NN, we unravel how the detailed microscopic NN training dynamics statistically give rise to an LFP model with quantitative predictive power.
Comments: to appear in Chinese Physics Letters
Subjects: Machine Learning (cs.LG); Data Analysis, Statistics and Probability (physics.data-an)
DOI: 10.1088/0256-307X/38/3/038701
Cite as: arXiv:2102.00200 [cs.LG]
  (or arXiv:2102.00200v1 [cs.LG] for this version)

Submission history

From: Zhiqin Xu
[v1] Sat, 30 Jan 2021 10:11:37 GMT (92kb,D)
