Title: High-dimensional variable selection with heterogeneous signals: A precise asymptotic perspective

Abstract: We study the problem of exact support recovery for high-dimensional sparse linear regression when the signals are weak, rare and possibly heterogeneous. Specifically, we fix the minimum signal magnitude at the information-theoretic optimal rate and investigate the asymptotic selection accuracy of best subset selection (BSS) and marginal screening (MS) procedures under independent Gaussian design. Despite this ideal setup, marginal screening can, somewhat surprisingly, fail to achieve exact recovery with probability converging to one in the presence of heterogeneous signals, whereas BSS enjoys model consistency whenever the minimum signal strength is above the information-theoretic threshold. To mitigate the computational cost of BSS, we also propose a surrogate two-stage algorithm called ETS (Estimate Then Screen), based on iterative hard thresholding and gradient coordinate screening, and we show that ETS shares exactly the same asymptotic optimality in terms of exact recovery as BSS. Finally, we present a simulation study comparing ETS with LASSO and marginal screening. The numerical results echo our asymptotic theory even for realistic values of the sample size, dimension and sparsity.
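
To make the two families of procedures concrete, the following is a minimal Python sketch, not the authors' implementation. Marginal screening is taken to mean keeping the s columns with the largest |X_j^T y| / n. The two-stage routine is one plausible reading of "estimate then screen": run iterative hard thresholding, then rank coordinates by the estimate plus one gradient step and keep the top s. The function names, the step size, the iteration count and the exact form of the screening statistic are all assumptions made for illustration; the abstract does not specify them.

```python
def matvec(X, b):
    """Return X @ b for a matrix X stored as a list of rows."""
    return [sum(x * bj for x, bj in zip(row, b)) for row in X]

def correlations(X, r):
    """Return X^T r / n: scaled inner product of each column with r."""
    n, p = len(X), len(X[0])
    return [sum(X[i][j] * r[i] for i in range(n)) / n for j in range(p)]

def top_s(v, s):
    """Sorted indices of the s entries of v with largest absolute value."""
    order = sorted(range(len(v)), key=lambda j: abs(v[j]), reverse=True)
    return sorted(order[:s])

def marginal_screening(X, y, s):
    """Keep the s coordinates with the largest marginal correlation."""
    return top_s(correlations(X, y), s)

def iht(X, y, s, step=1.0, iters=50):
    """Iterative hard thresholding: gradient step, then keep top-s entries.

    step and iters are illustrative defaults, not values from the paper.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        resid = [yi - fi for yi, fi in zip(y, matvec(X, beta))]
        grad = correlations(X, resid)  # negative gradient of the squared loss / (2n)
        z = [b + step * g for b, g in zip(beta, grad)]
        keep = set(top_s(z, s))
        beta = [z[j] if j in keep else 0.0 for j in range(p)]
    return beta

def estimate_then_screen(X, y, s, step=1.0, iters=50):
    """Hypothetical two-stage sketch: IHT estimate, then coordinate screening.

    The screening statistic (estimate plus one gradient step) is an
    assumption; the paper's exact statistic may differ.
    """
    beta = iht(X, y, s, step, iters)
    resid = [yi - fi for yi, fi in zip(y, matvec(X, beta))]
    grad = correlations(X, resid)
    scores = [b + g for b, g in zip(beta, grad)]
    return top_s(scores, s)

# Toy noiseless example with an orthogonal design (X^T X / n = I).
X = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 2.0, 0.0],
     [0.0, 0.0, 0.0, 2.0]]
beta_true = [3.0, 0.0, -2.0, 0.0]
y = matvec(X, beta_true)

print(marginal_screening(X, y, 2))    # [0, 2]
print(estimate_then_screen(X, y, 2))  # [0, 2]
```

On this noiseless orthogonal toy design both procedures return the true support, so it only checks that the mechanics work. It does not reproduce the regime studied in the paper, where heterogeneous signal magnitudes can cause marginal screening to fail.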
Comments: 30 pages, 3 figures
Subjects: Statistics Theory (math.ST)
Cite as: arXiv:2201.01508 [math.ST]
  (or arXiv:2201.01508v2 [math.ST] for this version)

Submission history

From: Saptarshi Roy
[v1] Wed, 5 Jan 2022 09:26:31 GMT (5524kb,D)
[v2] Sat, 10 Sep 2022 20:06:27 GMT (5904kb,D)
