Title: Robust subset selection

Authors: Ryan Thompson
Abstract: The best subset selection (or "best subsets") estimator is a classic tool for sparse regression, and developments in mathematical optimization over the past decade have made it more computationally tractable than ever. Notwithstanding its desirable statistical properties, the best subsets estimator is susceptible to outliers and can break down in the presence of a single contaminated data point. To address this issue, we propose a robust adaption of best subsets that is highly resistant to contamination in both the response and the predictors. Our estimator generalizes the notion of subset selection to both predictors and observations, thereby achieving robustness in addition to sparsity. This procedure, which we call "robust subset selection" (or "robust subsets"), is defined by a combinatorial optimization problem for which we apply modern discrete optimization methods. We formally establish the robustness of our estimator in terms of the finite-sample breakdown point of its objective value. In support of this result, we report experiments on both synthetic and real data that demonstrate the superiority of robust subsets over best subsets in the presence of contamination. Importantly, robust subsets fares competitively across several metrics compared with popular robust adaptions of the Lasso.
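As a rough illustration of what selecting subsets of both predictors and observations can look like, the sketch below brute-forces a toy least-squares problem: it searches over every set of at most k predictors and every set of h observations and keeps the fit with the smallest residual sum of squares. This objective is an assumed reading of the abstract, the names `k`, `h`, `solve_normal_equations`, and `robust_subsets_brute_force` are hypothetical, and exhaustive enumeration stands in for the discrete optimization methods the abstract mentions; it is only feasible for very small data.

```python
from itertools import combinations


def solve_normal_equations(X, y):
    """Least-squares coefficients via Gauss-Jordan elimination on [X'X | X'y].

    Intended for tiny toy problems only; returns None if X'X is singular.
    """
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[a] * row[b] for row in X) for b in range(p)]
        + [sum(row[a] * yi for row, yi in zip(X, y))]
        for a in range(p)
    ]
    for col in range(p):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [v - f * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def robust_subsets_brute_force(X, y, k, h):
    """Exhaustively search predictor subsets (size <= k) and observation
    subsets (size h), returning (rss, predictors, observations, beta).

    Assumption: fitting exactly h observations is enough, since adding
    observations to the fitted set cannot lower the residual sum of squares.
    """
    n, p = len(X), len(X[0])
    best = (float("inf"), None, None, None)
    for size in range(1, k + 1):
        for S in combinations(range(p), size):
            for I in combinations(range(n), h):
                Xs = [[X[i][j] for j in S] for i in I]
                ys = [y[i] for i in I]
                beta = solve_normal_equations(Xs, ys)
                if beta is None:
                    continue
                rss = sum(
                    (yi - sum(b * x for b, x in zip(beta, row))) ** 2
                    for row, yi in zip(Xs, ys)
                )
                if rss < best[0]:
                    best = (rss, S, I, beta)
    return best


# Toy data: y = 2 * x0 exactly, x1 is irrelevant, and the last response is contaminated.
X = [[1, 0.5], [2, -1.0], [3, 0.3], [4, 0.8], [5, -0.6], [6, 0.1]]
y = [2, 4, 6, 8, 10, 100]

rss, S, I, beta = robust_subsets_brute_force(X, y, k=1, h=5)
print("predictors:", S, "observations:", I, "beta:", beta, "rss:", rss)
```

In this toy run the search is free to leave one observation out of the fit, which is how the contaminated response can be excluded rather than distorting the coefficient estimate. The enumeration grows combinatorially in both the number of predictors and the number of observations, so it is a teaching device rather than a practical solver.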
Subjects: Methodology (stat.ME)
Cite as: arXiv:2005.08217 [stat.ME]
  (or arXiv:2005.08217v2 [stat.ME] for this version)

Submission history

From: Ryan Thompson
[v1] Sun, 17 May 2020 10:56:33 GMT (72kb,D)
[v2] Tue, 2 Jun 2020 09:00:24 GMT (72kb,D)
[v3] Mon, 10 Jan 2022 03:22:35 GMT (102kb,D)