Computer Science > Machine Learning
Title: The Least Wrong Model Is Not in the Data
(Submitted on 3 Apr 2014 (v1), last revised 17 Apr 2014 (this version, v3))
Abstract: The true process that generated data cannot be determined when multiple explanations are possible. Prediction requires a model of the probability that a process, chosen randomly from the set of candidate explanations, generates some future observation. The best model includes all of the information contained in the minimal description of the data that is not contained in the data. This information is closely related to the Halting Problem and is logarithmic in the size of the data. Prediction is difficult because the ideal model is not computable, and the best computable model is not "findable." However, the error from any approximation can be bounded by the size of the description using the model.
Submission history
From: Oscar Stiffelman
[v1] Thu, 3 Apr 2014 07:41:46 GMT (10kb)
[v2] Fri, 4 Apr 2014 08:58:30 GMT (10kb)
[v3] Thu, 17 Apr 2014 19:53:46 GMT (11kb)