Title: Machine learning meets false discovery rate

Abstract: Classical false discovery rate (FDR) controlling procedures offer strong and interpretable guarantees, but they often lack flexibility. On the other hand, recent machine learning classification algorithms, such as those based on random forests (RF) or neural networks (NN), perform very well in practice but lack interpretability and theoretical guarantees. In this paper, we bring the two together by introducing a new adaptive novelty detection procedure with FDR control, called AdaDetect. It extends the scope of recent work in the multiple testing literature, notably that of Yang et al. (2021), to the high-dimensional setting. AdaDetect is shown both to strongly control the FDR and to have power that mimics that of the oracle in a specific sense. The interest and validity of our approach are demonstrated with theoretical results, numerical experiments on several benchmark datasets, and an application to astrophysical data. In particular, while AdaDetect can be used in combination with any classifier, it is particularly efficient on real-world datasets with RF, and on images with NN.
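
The abstract does not spell out the steps of AdaDetect, so the following is only a minimal sketch of the general idea of combining classifier scores with FDR control, not the authors' procedure. It assumes that some already-trained classifier has produced a novelty score for every point (a higher score meaning "more novel"), that a held-out sample of known nulls is available for calibration, and that FDR control is obtained by applying the standard Benjamini-Hochberg step-up rule to empirical p-values. The function names `empirical_p_values` and `benjamini_hochberg` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def empirical_p_values(null_scores: Sequence[float],
                       test_scores: Sequence[float]) -> List[float]:
    """Empirical p-value of each test score relative to null calibration scores.

    Assumption: larger scores indicate stronger evidence of novelty.
    """
    n = len(null_scores)
    return [
        (1 + sum(1 for s in null_scores if s >= t)) / (n + 1)
        for t in test_scores
    ]


def benjamini_hochberg(p_values: Sequence[float], alpha: float) -> List[bool]:
    """Standard Benjamini-Hochberg step-up rule at level alpha.

    Returns a list of booleans: True where the null hypothesis is rejected.
    """
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that the k-th smallest p-value is <= alpha * k / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Toy usage with made-up scores (illustrative values, not data from the paper).
null_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
test_scores = [0.05, 5.0, 6.0, 0.5]
p_vals = empirical_p_values(null_scores, test_scores)
detections = benjamini_hochberg(p_vals, alpha=0.25)
```

In this toy example the two test points with very large scores receive the smallest possible p-value and are flagged as novelties, while the other two are not. How the scores are learned, and under which conditions the resulting procedure controls the FDR, is the subject of the paper itself.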
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as: arXiv:2208.06685 [stat.ME]
  (or arXiv:2208.06685v1 [stat.ME] for this version)

Submission history

From: Ariane Marandon
[v1] Sat, 13 Aug 2022 17:14:55 GMT (5365kb,D)
[v2] Sat, 22 Oct 2022 08:35:12 GMT (5319kb,D)