The Least Wrong Model Is Not in the Data

Stiffelman, Oscar

(Submitted on 3 Apr 2014 (v1), last revised 17 Apr 2014 (this version, v3))

Abstract: The true process that generated data cannot be determined when multiple explanations are possible. Prediction requires a model of the probability that a process, chosen randomly from the set of candidate explanations, generates some future observation. The best model includes all of the information contained in the minimal description of the data that is not contained in the data. It is closely related to the Halting Problem and is logarithmic in the size of the data. Prediction is difficult because the ideal model is not computable, and the best computable model is not "findable." However, the error from any approximation can be bounded by the size of the description using the model.

Comments: added citations and acknowledgements, and replaced the ideal model section with a more intuitive argument
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:1404.0789 [cs.LG] (or arXiv:1404.0789v3 [cs.LG] for this version)

Submission history

From: Oscar Stiffelman
[v1] Thu, 3 Apr 2014 07:41:46 GMT (10kb)
[v2] Fri, 4 Apr 2014 08:58:30 GMT (10kb)
[v3] Thu, 17 Apr 2014 19:53:46 GMT (11kb)
