Title: A Multi-resolution Theory for Approximating Infinite-$p$-Zero-$n$: Transitional Inference, Individualized Predictions, and a World Without Bias-Variance Trade-off

Abstract: Transitional inference is an empiricist concept, rooted and practiced in clinical medicine since ancient Greece. Knowledge and experience gained from treating one entity are applied to treat a related but distinctively different one. This notion of "transition to the similar" gives individualized treatments an operational meaning, yet its theoretical foundation defies the familiar inductive inference framework. The uniqueness of entities is the result of a potentially infinite number of attributes (hence $p=\infty$), which entails zero direct training sample size (i.e., $n=0$) because genuine guinea pigs do not exist. However, the literature on wavelets and on sieve methods suggests a principled approximation theory for transitional inference via a multi-resolution (MR) perspective, where we use the resolution level to index the degree of approximation to ultimate individuality. MR inference seeks a primary resolution indexing an indirect training sample, which provides enough matched attributes to increase the relevance of the results to the target individuals and yet still accumulates sufficient indirect sample sizes for robust estimation. Theoretically, MR inference relies on an infinite-term ANOVA-type decomposition, providing an alternative way to model sparsity via the decay rate of the resolution bias as a function of the primary resolution level. Unexpectedly, this decomposition reveals a world without variance when the outcome is a deterministic function of potentially infinitely many predictors. In this deterministic world, the optimal resolution prefers over-fitting in the traditional sense when the resolution bias decays sufficiently rapidly. Furthermore, there can be many "descents" in the prediction error curve when the contributions of predictors are inhomogeneous and the ordering of their importance does not align with the order of their inclusion in prediction.
Subjects: Statistics Theory (math.ST)
Cite as: arXiv:2010.08876 [math.ST]
  (or arXiv:2010.08876v3 [math.ST] for this version)

Submission history

From: Xinran Li
[v1] Sat, 17 Oct 2020 21:56:59 GMT (308kb,D)
[v2] Tue, 27 Oct 2020 05:14:32 GMT (319kb,D)
[v3] Mon, 14 Dec 2020 08:20:33 GMT (319kb,D)
