Title: Linear Regression, Covariate Selection and the Failure of Modelling
(Submitted on 16 Dec 2021 (v1), last revised 22 Feb 2022 (this version, v4))
Abstract: It is argued that all model-based approaches to the selection of covariates in linear regression have failed. This applies to frequentist approaches based on P-values and to Bayesian approaches, although for different reasons. In the first part of the paper, 13 model-based procedures are compared to the model-free Gaussian covariate procedure in terms of the covariates selected and the time required. The comparison is based on seven data sets and three simulations. There is nothing special about these data sets, which are often used as examples in the literature. All the model-based procedures failed.
In the second part of the paper it is argued that the cause of this failure is the very use of a model. If the model involves all the available covariates, standard P-values can be used, and their use in this situation is quite straightforward. As soon as the model specifies only some unknown subset of the covariates, the problem being to identify this subset, the situation changes radically. There are many P-values, they are dependent, and most of them are invalid. The P-value based approach collapses. The Bayesian paradigm also assumes a correct model. Although it has no conceptual problems with a large number of covariates, there is a considerable overhead that causes computational and allocation problems even for moderately sized data sets.
The Gaussian covariate procedure is based on P-values which are defined as the probability that a random Gaussian covariate is better than the covariate being considered. These P-values are exact and valid whatever the situation. The allocation requirements and the algorithmic complexity are both linear in the size of the data, making the procedure capable of handling large data sets. It outperforms all the other procedures in every respect.
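The definition of the Gaussian covariate P-value can be illustrated with a small simulation. The sketch below is not the paper's procedure: the abstract says the P-values are exact, whereas this code only estimates one by Monte Carlo. It assumes that "better" means a smaller residual sum of squares, and it restricts itself to a single covariate fitted with an intercept. The function names `rss_simple` and `gaussian_pvalue_mc` are hypothetical and chosen for illustration only.

```python
import random


def rss_simple(y, x):
    """Residual sum of squares of the least-squares fit y ~ a + b*x."""
    n = len(y)
    my = sum(y) / n
    mx = sum(x) / n
    syy = sum((v - my) ** 2 for v in y)
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    if sxx == 0.0:
        # A constant covariate explains nothing beyond the intercept.
        return syy
    return syy - sxy ** 2 / sxx


def gaussian_pvalue_mc(y, x, draws=2000, seed=0):
    """Monte Carlo estimate of the probability that a random standard
    Gaussian covariate gives a smaller RSS than x (assumed meaning of
    "better"; the paper's exact computation is not reproduced here)."""
    rng = random.Random(seed)
    rss_x = rss_simple(y, x)
    n = len(y)
    better = 0
    for _ in range(draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if rss_simple(y, z) < rss_x:
            better += 1
    return better / draws


# Illustrative data (made up): y depends on x_signal but not on x_noise.
data_rng = random.Random(1)
n = 50
x_signal = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
x_noise = [data_rng.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * u + data_rng.gauss(0.0, 1.0) for u in x_signal]

p_signal = gaussian_pvalue_mc(y, x_signal)
p_noise = gaussian_pvalue_mc(y, x_noise)
print("P-value estimate, relevant covariate:  ", p_signal)
print("P-value estimate, irrelevant covariate:", p_noise)
```

A covariate that genuinely explains the response is rarely beaten by pure noise, so its estimated P-value is small. An irrelevant covariate is itself indistinguishable from noise, so its estimate can fall anywhere between 0 and 1.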
Submission history
From: Patrick Laurie Davies Mr
[v1] Thu, 16 Dec 2021 09:44:18 GMT (28kb,D)
[v2] Mon, 7 Feb 2022 10:15:34 GMT (33kb,D)
[v3] Fri, 11 Feb 2022 10:30:17 GMT (33kb,D)
[v4] Tue, 22 Feb 2022 10:14:51 GMT (33kb,D)