
Title: Adapting to Unknown Sparsity by controlling the False Discovery Rate

Abstract: We attempt to recover an $n$-dimensional vector observed in white noise, where $n$ is large and the vector is known to be sparse, but the degree of sparsity is unknown. We consider three different ways of defining sparsity of a vector: using the fraction of nonzero terms; imposing power-law decay bounds on the ordered entries; and controlling the $\ell_p$ norm for $p$ small. We obtain a procedure which is asymptotically minimax for $\ell^r$ loss, simultaneously throughout a range of such sparsity classes.
The optimal procedure is a data-adaptive thresholding scheme, driven by control of the {\it False Discovery Rate} (FDR). FDR control is a relatively recent innovation in simultaneous testing, ensuring that at most a certain fraction of the rejected null hypotheses will correspond to false rejections.
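As a rough illustration of what an FDR-driven thresholding scheme can look like, the sketch below applies a Benjamini-Hochberg style step-up rule to the sorted magnitudes and then hard-thresholds the data. It is a minimal sketch under stated assumptions, not necessarily the exact procedure analyzed in the paper: the function names `fdr_threshold` and `fdr_estimate`, the known noise level `sigma`, the two-sided Gaussian boundary $t_k = \sigma z(qk/2n)$, and the fallback to $t_1$ when no observation crosses the boundary are all assumptions introduced here.

```python
from statistics import NormalDist


def fdr_threshold(y, q, sigma=1.0):
    """Pick a data-adaptive threshold with a BH-style step-up rule (assumed form).

    y     : list of noisy observations
    q     : FDR control parameter, 0 < q <= 1
    sigma : noise standard deviation, assumed known
    """
    n = len(y)
    std_normal = NormalDist()
    abs_sorted = sorted((abs(v) for v in y), reverse=True)

    def boundary(k):
        # Upper q*k/(2n) quantile of N(0, sigma^2): two-sided boundary for rank k.
        return sigma * std_normal.inv_cdf(1.0 - q * k / (2.0 * n))

    # Step-up: keep the LARGEST rank k whose ordered magnitude reaches its boundary.
    k_hat = 0
    for k in range(1, n + 1):
        if abs_sorted[k - 1] >= boundary(k):
            k_hat = k

    # Assumed convention: if nothing crosses, use the most conservative boundary t_1.
    return boundary(max(k_hat, 1))


def fdr_estimate(y, q, sigma=1.0):
    """Hard-threshold y at the FDR-selected threshold."""
    t = fdr_threshold(y, q, sigma)
    return [v if abs(v) >= t else 0.0 for v in y]
```

For example, `fdr_estimate(y, q=0.1)` keeps only the coordinates whose magnitude reaches the selected threshold and sets the rest to zero, so the threshold drops as more large coordinates appear in the data.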
In our treatment, the FDR control parameter $q_n$ also plays a determining role in asymptotic minimaxity. If $q = \lim q_n \in [0,1/2]$ and also $q_n > \gamma/\log(n)$, we get sharp asymptotic minimaxity, simultaneously, over a wide range of sparse parameter spaces and loss functions. On the other hand, $q = \lim q_n \in (1/2,1]$ forces the risk to exceed the minimax risk by a factor growing with $q$.
To our knowledge, this relation between ideas in simultaneous inference and asymptotic decision theory is new.
Our work provides a new perspective on a class of model selection rules which has been introduced recently by several authors. These new rules impose complexity penalization of the form $2 \cdot \log(\text{potential model size} / \text{actual model size})$. We exhibit a close connection with FDR-controlling procedures under stringent control of the false discovery rate.
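One way to probe the link between a $2\log(n/k)$ penalty and FDR-type thresholds numerically: for a standard Gaussian upper quantile, $z(\alpha)^2 \sim 2\log(1/\alpha)$ as $\alpha \to 0$, so a squared boundary of the form $z(qk/2n)^2$ is comparable to $2\log(n/k)$ when $k \ll n$. The sketch below computes the ratio of the two. The boundary form, the function name `penalty_ratio`, and the unit noise level are assumptions made for illustration; the abstract does not state them.

```python
import math
from statistics import NormalDist


def penalty_ratio(n, k, q):
    """Ratio of the squared boundary z(q*k/(2n))^2 to the penalty 2*log(n/k).

    n : potential model size
    k : actual model size, 1 <= k < n
    q : FDR control parameter, 0 < q <= 1
    """
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n so that log(n/k) > 0")
    # Upper-tail quantile of N(0, 1): z(alpha) = -Phi^{-1}(alpha).
    alpha = q * k / (2.0 * n)
    z = -NormalDist().inv_cdf(alpha)
    return (z * z) / (2.0 * math.log(n / k))
```

A call such as `penalty_ratio(10**6, 1, 0.1)` returns a value close to 1. The agreement is only approximate, since the two quantities differ by roughly an additive $2\log(2/q)$ term plus lower-order corrections.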
Comments: This is a complete version of a paper to appear in Annals of Statistics. The paper in AoS has certain proofs abbreviated that are given here in detail.
Subjects: Statistics Theory (math.ST)
MSC classes: 62F10; 62G12
Cite as: arXiv:math/0505374 [math.ST]
  (or arXiv:math/0505374v1 [math.ST] for this version)

Submission history

From: David Donoho
[v1] Wed, 18 May 2005 06:32:35 GMT (147kb)