Title: The Blessing of Dimensionality: Separation Theorems in the Thermodynamic Limit

Abstract: We consider and analyze properties of large sets of randomly selected (i.i.d.) points in high-dimensional spaces. In particular, we consider the problem of whether a single data point that is randomly chosen from a finite set of points can be separated from the rest of the data set by a hyperplane. We formulate and prove stochastic separation theorems, including: 1) with probability close to one, a random point may be separated from a finite random set by a linear functional; 2) with probability close to one, for every point in a finite random set there is a linear functional separating this point from the rest of the data. The total number of points in the random sets is allowed to be exponentially large with respect to dimension. Various laws governing distributions of points are considered, and explicit formulae for the probability of separation are provided. These theorems reveal an interesting implication for machine learning and data mining applications that deal with large data sets (big data) and high-dimensional data (many attributes): simple linear decision rules and learning machines are surprisingly efficient tools for separating and filtering out arbitrarily assigned points in high dimensions.
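
The separation claim in the abstract can be checked empirically with a small simulation. The sketch below is illustrative only and is not taken from the paper: it assumes points drawn uniformly from the unit ball, and it tests one particular candidate functional for each point x_i, namely y -> <x_i, y>, declaring x_i separable when <x_i, x_j> < <x_i, x_i> for every other point x_j. The function names `sample_unit_ball` and `fraction_separable`, the sample size, and the dimensions tried are all hypothetical choices. Because only one functional per point is tested, the reported fraction is a lower bound on the fraction of linearly separable points.

```python
import numpy as np


def sample_unit_ball(num_points, dim, rng):
    """Draw num_points i.i.d. points uniformly from the unit ball in R^dim."""
    # Uniform directions: normalize standard Gaussian vectors.
    directions = rng.standard_normal((num_points, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii with density proportional to r^(dim - 1).
    radii = rng.random(num_points) ** (1.0 / dim)
    return directions * radii[:, None]


def fraction_separable(points):
    """Fraction of points x_i with <x_i, x_j> < <x_i, x_i> for all j != i.

    Assumption: this tests a single candidate functional per point,
    so it can only underestimate linear separability.
    """
    gram = points @ points.T
    self_products = np.diag(gram).copy()
    # Exclude each point's product with itself from the row maximum.
    np.fill_diagonal(gram, -np.inf)
    return float(np.mean(gram.max(axis=1) < self_products))


rng = np.random.default_rng(0)
for dim in (2, 10, 50, 200):
    pts = sample_unit_ball(2000, dim, rng)
    print(f"dim={dim:4d}  separable fraction={fraction_separable(pts):.3f}")
```

With the sample size held fixed, the printed fraction should rise toward one as the dimension grows, which is the qualitative behaviour the theorems describe; the exact values depend on the assumed distribution and on the seed.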
Comments: A talk given at TFMST 2016, the 2nd IFAC Workshop on Thermodynamic Foundations of Mathematical Systems Theory, September 28-30, 2016, Vigo, Spain
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:1610.00494 [stat.ML]
  (or arXiv:1610.00494v1 [stat.ML] for this version)

Submission history

From: Ivan Y. Tyukin
[v1] Mon, 3 Oct 2016 11:15:12 GMT (258kb)
[v2] Thu, 24 Nov 2016 18:52:25 GMT (675kb)
[v3] Sun, 6 Aug 2017 15:24:27 GMT (610kb)
[v4] Wed, 13 Feb 2019 09:14:55 GMT (1235kb)