We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cond-mat.stat-mech

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Condensed Matter > Statistical Mechanics

Title: Statistical learning theory of structured data

Abstract: The traditional approach of statistical physics to supervised learning routinely assumes unrealistic generative models for the data: usually inputs are independent random variables, uncorrelated with their labels. Only recently, statistical physicists started to explore more complex forms of data, such as equally-labelled points lying on (possibly low dimensional) object manifolds. Here we provide a bridge between this recently-established research area and the framework of statistical learning theory, a branch of mathematics devoted to inference in machine learning. The overarching motivation is the inadequacy of the classic rigorous results in explaining the remarkable generalization properties of deep learning. We propose a way to integrate physical models of data into statistical learning theory, and address, with both combinatorial and statistical mechanics methods, the computation of the Vapnik-Chervonenkis entropy, which counts the number of different binary classifications compatible with the loss class. As a proof of concept, we focus on kernel machines and on two simple realizations of data structure introduced in recent physics literature: $k$-dimensional simplexes with prescribed geometric relations and spherical manifolds (equivalent to margin classification). Entropy, contrary to what happens for unstructured data, is nonmonotonic in the sample size, in contrast with the rigorous bounds. Moreover, data structure induces a novel transition beyond the storage capacity, which we advocate as a proxy of the nonmonotonicity, and ultimately a cue of low generalization error. The identification of a synaptic volume vanishing at the transition allows a quantification of the impact of data structure within replica theory, applicable in cases where combinatorial methods are not available, as we demonstrate for margin learning.
Comments: 19 pages, 3 figures
Subjects: Statistical Mechanics (cond-mat.stat-mech); Disordered Systems and Neural Networks (cond-mat.dis-nn)
Journal reference: Phys. Rev. E 102, 032119 (2020)
DOI: 10.1103/PhysRevE.102.032119
Cite as: arXiv:2005.10002 [cond-mat.stat-mech]
  (or arXiv:2005.10002v1 [cond-mat.stat-mech] for this version)

Submission history

From: Pietro Rotondo [view email]
[v1] Wed, 20 May 2020 12:40:49 GMT (194kb,D)

Link back to: arXiv, form interface, contact.