Mathematics > Statistics Theory
Title: Bootstrapping and Sample Splitting For High-Dimensional, Assumption-Free Inference
(Submitted on 16 Nov 2016 (v1), last revised 2 Apr 2018 (this version, v2))
Abstract: Several new methods have been proposed for performing valid inference after model selection. An older method is sample splitting: use part of the data for model selection and part for inference. In this paper we revisit sample splitting combined with the bootstrap (or the Normal approximation). We show that this leads to a simple, assumption-free approach to inference and we establish results on the accuracy of the method. In fact, we find new bounds on the accuracy of the bootstrap and the Normal approximation for general nonlinear parameters with increasing dimension, which we then use to assess the accuracy of regression inference. We show that an alternative, called the image bootstrap, has higher coverage accuracy at the cost of more computation. We define new parameters that measure variable importance and that can be inferred with greater accuracy than the usual regression coefficients. There is an inference-prediction tradeoff: splitting increases the accuracy and robustness of inference but can decrease the accuracy of the predictions.
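The split-then-bootstrap idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's procedure: the function name `split_and_bootstrap`, the selection rule (top-k absolute marginal correlation), the pairs bootstrap, and the sup-norm box are all assumptions chosen for brevity. The one point taken from the abstract is that one half of the data picks the model and the other half is used for inference.

```python
import numpy as np

def split_and_bootstrap(X, y, k=2, n_boot=500, alpha=0.05, seed=0):
    """Hypothetical helper: select k variables on one half of the data,
    then bootstrap least-squares coefficients on the other half."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    perm = rng.permutation(n)
    sel_idx, inf_idx = perm[: n // 2], perm[n // 2:]

    # Selection half: keep the k columns with the largest absolute
    # marginal correlation with y (an assumed, simple selection rule).
    Xs, ys = X[sel_idx], y[sel_idx]
    Xc = Xs - Xs.mean(axis=0)
    yc = ys - ys.mean()
    score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    chosen = np.sort(np.argsort(score)[-k:])

    # Inference half: least squares on the chosen columns plus an intercept.
    Xi = np.column_stack([np.ones(len(inf_idx)), X[inf_idx][:, chosen]])
    yi = y[inf_idx]

    def fit(A, b):
        return np.linalg.lstsq(A, b, rcond=None)[0]

    beta_hat = fit(Xi, yi)

    # Pairs bootstrap: resample rows of the inference half with replacement.
    m = len(yi)
    boot = np.empty((n_boot, Xi.shape[1]))
    for b in range(n_boot):
        rows = rng.integers(0, m, size=m)
        boot[b] = fit(Xi[rows], yi[rows])

    # Simultaneous box: half-width is the (1 - alpha) quantile of the largest
    # absolute deviation over the selected coefficients (intercept excluded).
    dev = np.abs(boot[:, 1:] - beta_hat[1:]).max(axis=1)
    t = np.quantile(dev, 1 - alpha)
    est = beta_hat[1:]
    return chosen, est, est - t, est + t
```

Because the selection half never touches the inference half, the intervals are computed as if the chosen columns had been fixed in advance. The price is that each step uses only about half of the observations.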
Submission history
From: Larry Wasserman
[v1] Wed, 16 Nov 2016 18:34:47 GMT (70kb,D)
[v2] Mon, 2 Apr 2018 19:01:47 GMT (91kb,D)