Probabilistic Best Subset Selection via Gradient-Based Optimization

Yin, Mingzhang; Ho, Nhat; Yan, Bowei; Qian, Xiaoning; Zhou, Mingyuan

(Submitted on 11 Jun 2020 (v1), revised 7 Aug 2020 (this version, v3), latest version 1 Jun 2022 (v4))

Abstract: In high-dimensional statistics, variable selection is an optimization problem aiming to recover the latent sparse pattern from all possible covariate combinations. In this paper, we propose a novel optimization method to solve the exact $L_0$-regularized regression problem (a.k.a. best subset selection). We reformulate the optimization problem from a discrete space to a continuous one via probabilistic reparameterization. Within the framework of stochastic gradient descent, we propose a family of unbiased gradient estimators to optimize the $L_0$-regularized objective and a variational lower bound. Within this family, we identify the estimator with a non-vanishing signal-to-noise ratio and uniformly minimum variance. Theoretically, we study the general conditions under which the method is guaranteed to converge to the ground truth in expectation. In a wide variety of synthetic and semi-synthetic data sets, the proposed method outperforms existing variable selection methods that are based on penalized regression and mixed-integer optimization, in both sparse pattern recovery and out-of-sample prediction. Our method can find the true regression model from thousands of covariates in a couple of seconds.
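
To make the idea of a probabilistic reparameterization concrete, here is a minimal Python sketch. It is not the paper's estimator family. It assumes each inclusion indicator $z_j$ is drawn as Bernoulli with probability $\pi_j = \sigma(\phi_j)$, so the discrete search over subsets becomes a continuous problem in $\phi$. It then uses the generic score-function (REINFORCE) identity to get an unbiased gradient of the expected $L_0$-penalized loss. The function names, the per-subset least-squares refit, and the scaling of the loss by the sample size are all assumptions made for illustration.

```python
import itertools
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def l0_objective(z, X, y, lam):
    """Mean squared residual on the selected columns plus lam * subset size.

    Assumption: coefficients are refit by least squares for each subset z.
    """
    idx = np.flatnonzero(z)
    if idx.size == 0:
        rss = float(y @ y)
    else:
        beta, *_ = np.linalg.lstsq(X[:, idx], y, rcond=None)
        resid = y - X[:, idx] @ beta
        rss = float(resid @ resid)
    return rss / len(y) + lam * idx.size


def expected_objective(phi, X, y, lam):
    """Exact E_{z ~ Bernoulli(sigmoid(phi))}[objective], by enumeration.

    Only feasible for a handful of covariates; used here as a reference.
    """
    pi = sigmoid(phi)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=len(phi)):
        z = np.array(bits, dtype=float)
        prob = np.prod(np.where(z == 1, pi, 1 - pi))
        total += prob * l0_objective(z, X, y, lam)
    return total


def score_function_gradient(phi, X, y, lam, rng, n_samples=1):
    """Unbiased Monte Carlo estimate of d/dphi of the expected objective.

    Uses d/dphi_j log p(z) = z_j - sigmoid(phi_j) for a Bernoulli with logit phi_j.
    """
    pi = sigmoid(phi)
    grad = np.zeros_like(phi, dtype=float)
    for _ in range(n_samples):
        z = (rng.random(len(phi)) < pi).astype(float)
        grad += l0_objective(z, X, y, lam) * (z - pi)
    return grad / n_samples
```

A plain score-function estimator like this one is unbiased but typically has high variance, which is the kind of issue the abstract's discussion of signal-to-noise ratio and minimum variance addresses.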

Subjects:	Methodology (stat.ME)
Cite as:	arXiv:2006.06448 [stat.ME]
	(or arXiv:2006.06448v3 [stat.ME] for this version)

Submission history

From: Mingzhang Yin
[v1] Thu, 11 Jun 2020 13:57:29 GMT (217kb,D)
[v2] Mon, 22 Jun 2020 18:28:46 GMT (217kb,D)
[v3] Fri, 7 Aug 2020 04:23:34 GMT (154kb,D)
[v4] Wed, 1 Jun 2022 01:59:07 GMT (215kb,D)
