Title: Estimating false inclusion rates in penalized regression models

Abstract: Penalized regression methods are an attractive tool for feature selection with many appealing properties, although their widespread adoption has been hampered by the difficulty of applying inferential tools. In particular, the question "How reliable is the selection of those features?" has proved difficult to address, partially due to the complexity of defining a false discovery in the penalized regression setting. Here, I define a false inclusion as a variable that is independent of the outcome regardless of whether other variables are conditioned on. This definition permits straightforward estimation of the number of false inclusions. Theoretical analysis and simulation studies demonstrate that this approach is quite accurate when the correlation among predictors is mild, and slightly conservative when the correlation is moderate. Finally, the practical utility of the proposed method is illustrated using gene expression data from The Cancer Genome Atlas and GWAS data from the Myocardial Applied Genomics Network.
Comments: 14 pages, 7 figures
Subjects: Statistics Theory (math.ST)
Cite as: arXiv:1607.05636 [math.ST]
  (or arXiv:1607.05636v1 [math.ST] for this version)

Submission history

From: Patrick Breheny
[v1] Tue, 19 Jul 2016 15:37:25 GMT (53kb,D)
[v2] Fri, 7 Apr 2017 14:31:58 GMT (70kb,D)
