Title: NeuralFDR: Learning Discovery Thresholds from Hypothesis Features

Abstract: As datasets grow richer, an important challenge is to leverage all the features in the data to maximize the number of useful discoveries while controlling for false positives. We address this problem in the context of multiple hypothesis testing, where for each hypothesis we observe a p-value along with a set of features specific to that hypothesis. For example, in genetic association studies, each hypothesis tests the correlation between a variant and the trait. We have a rich set of features for each variant (e.g., its location, conservation, and epigenetics) which could inform how likely the variant is to have a true association. However, popular testing approaches, such as the Benjamini-Hochberg procedure (BH) and independent hypothesis weighting (IHW), either ignore these features or assume that the features are categorical or univariate. We propose a new algorithm, NeuralFDR, which automatically learns a discovery threshold as a function of all the hypothesis features. We parametrize the discovery threshold as a neural network, which enables flexible handling of multi-dimensional discrete and continuous features as well as efficient end-to-end optimization. We prove that NeuralFDR has strong false discovery rate (FDR) guarantees, and show that it makes substantially more discoveries on synthetic and real datasets. Moreover, we demonstrate that the learned discovery threshold is directly interpretable.
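
For context, the BH baseline named in the abstract applies a single, feature-independent rule to all p-values. The block below is a minimal sketch of the standard BH step-up procedure, not code from the paper; the function name `benjamini_hochberg`, the `alpha` parameter, and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Standard BH step-up procedure (illustrative sketch, not the paper's code).

    Returns a list of booleans, True where the hypothesis is rejected.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max smallest p-values; features play no role here.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Made-up p-values for illustration only.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(p, alpha=0.05)
print(sum(decisions), "discoveries")
```

Because the cutoff depends only on the ranks of the p-values, every hypothesis is compared against the same threshold regardless of its features. By contrast, the abstract describes a threshold that varies with the features of each hypothesis.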
Subjects: Methodology (stat.ME); Applications (stat.AP); Machine Learning (stat.ML)
Cite as: arXiv:1711.01312 [stat.ME]
  (or arXiv:1711.01312v4 [stat.ME] for this version)

Submission history

From: Martin Zhang
[v1] Fri, 3 Nov 2017 19:27:11 GMT (4468kb,D)
[v2] Tue, 7 Nov 2017 08:22:26 GMT (4788kb,D)
[v3] Wed, 8 Nov 2017 06:01:26 GMT (4788kb,D)
[v4] Sat, 18 Nov 2017 20:44:38 GMT (4787kb,D)
