
Title: Hypothesis Testing for Sparse Binary Regression

Abstract: In this paper, we study the detection boundary for hypothesis testing in high-dimensional logistic regression. Many of our results also apply to general binary regression models. We observe a new phenomenon in the behavior of the detection boundary that does not occur in the Gaussian framework. Suppose there are $n$ samples of binary outcomes, each with $p$ covariates, and the outcomes are related to the covariates by a logistic model. We are interested in testing the global null hypothesis that the regression coefficients are all zero against a sparse alternative with $k$ signals, where $k = p^{1-\alpha}$ and $\alpha \in [0, 1]$. We show that the detection problem depends heavily on the structure and sparsity of the design matrix. For a balanced one-way design matrix, we show that the number of repeated observations determines the detection complexity. If the number of replications is too low, then, unlike in the Gaussian case, detection becomes impossible irrespective of the signal strength. Once the number of replications exceeds a certain threshold, however, our results parallel the Gaussian case. In such cases we derive the sharp detection boundary for both the dense ($\alpha \leq \frac{1}{2}$) and sparse ($\alpha > \frac{1}{2}$) regimes. In the dense regime, the generalized likelihood ratio test continues to be asymptotically powerful above the detection boundary. In the sparse regime, however, we need to design a new test, a version of the popular Higher Criticism test. We show that this new test attains the detection boundary as a sharp upper bound.
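
To make the setting concrete, the following is a minimal sketch of a Higher Criticism style statistic in a balanced one-way design. It is not the test constructed in the paper, which the abstract describes only as "a version of" Higher Criticism; it uses the classical Donoho-Jin form applied to exact binomial p-values. The sketch assumes no intercept, so that under the global null each outcome is a fair coin flip, and it assumes each of the $p$ groups has $r$ replications. The function names, the parameter `fraction`, and the simulation helper are all hypothetical.

```python
import math
import random


def binomial_two_sided_pvalue(z, r):
    """Exact two-sided p-value for z successes in r fair-coin trials.

    Assumption: under the null (all coefficients zero, no intercept)
    each binary outcome has success probability 1/2.
    """
    dev = abs(z - r / 2)
    # Sum the probabilities of all counts at least as far from r/2 as z.
    tail = sum(math.comb(r, j) for j in range(r + 1) if abs(j - r / 2) >= dev)
    return tail / 2 ** r


def higher_criticism(pvalues, fraction=0.5):
    """Classical Donoho-Jin Higher Criticism statistic (illustrative).

    Maximizes the standardized gap between the empirical and the uniform
    distribution of the p-values over the smallest `fraction` of them.
    """
    p_sorted = sorted(pvalues)
    m = len(p_sorted)
    best = -math.inf
    for i, pv in enumerate(p_sorted[: max(1, int(fraction * m))], start=1):
        pv = min(max(pv, 1e-12), 1 - 1e-12)  # guard the denominator
        score = math.sqrt(m) * (i / m - pv) / math.sqrt(pv * (1 - pv))
        best = max(best, score)
    return best


def simulate_counts(p, r, k, beta, rng):
    """Success counts for p groups with r replications each.

    The first k groups carry a signal of size beta on the logit scale;
    the remaining groups follow the null (success probability 1/2).
    """
    prob_signal = 1 / (1 + math.exp(-beta))
    counts = []
    for j in range(p):
        q = prob_signal if j < k else 0.5
        counts.append(sum(rng.random() < q for _ in range(r)))
    return counts
```

One would compute `higher_criticism([binomial_two_sided_pvalue(z, r) for z in counts])` and reject the global null for large values. With a single replication (`r = 1`) every p-value in this sketch equals 1 whatever the signal strength, which is at least consistent with the abstract's claim that too few replications make detection impossible.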
Subjects: Statistics Theory (math.ST); Computation (stat.CO); Methodology (stat.ME)
Cite as: arXiv:1308.0764 [math.ST]
  (or arXiv:1308.0764v1 [math.ST] for this version)

Submission history

From: Rajarshi Mukherjee
[v1] Sun, 4 Aug 2013 01:06:15 GMT (134kb,D)
[v2] Mon, 28 Jul 2014 02:42:31 GMT (109kb,D)
[v3] Thu, 5 Mar 2015 10:30:23 GMT (216kb)
