Title: Machine learning meets false discovery rate

Abstract: Classical false discovery rate (FDR) controlling procedures offer strong and interpretable guarantees but often lack the flexibility to handle complex data. By contrast, machine learning-based classification algorithms achieve superior performance on modern datasets but typically come without error-control guarantees. In this paper, we bring the two together by introducing a new adaptive novelty detection procedure with FDR control, called AdaDetect. It extends recent work in the multiple testing literature, notably Yang et al. (2021), to the high-dimensional setting. We prove that AdaDetect comes with finite-sample guarantees: it controls the FDR strongly and approximates the oracle in terms of power, with explicit remainder terms that are small under mild conditions. In practice, AdaDetect can be combined with any machine learning-based classifier, which lets the user choose the most relevant classification approach. We illustrate this on classical real-world datasets, for which random forest and neural network classifiers are particularly efficient. The versatility of the method is also shown with an astrophysical application.
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as: arXiv:2208.06685 [stat.ME]
  (or arXiv:2208.06685v2 [stat.ME] for this version)
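
The abstract does not spell out the procedure, so the following is only a rough sketch of the general idea of novelty detection with FDR control, not the authors' AdaDetect implementation. It assumes, beyond what the abstract states, that each test point gets an empirical p-value by ranking its score against scores of held-out null examples, and that the Benjamini-Hochberg step-up rule is then applied. The function names and the toy scores are hypothetical; in practice the scores would come from whatever classifier the user picks.

```python
from bisect import bisect_left


def empirical_p_values(calib_scores, test_scores):
    """Rank-based p-values: a higher score is assumed to mean 'more novel'.

    calib_scores: scores of held-out examples known to be nulls (assumption).
    test_scores:  scores of the points to be tested.
    """
    n = len(calib_scores)
    sorted_calib = sorted(calib_scores)
    pvals = []
    for s in test_scores:
        # Number of calibration scores greater than or equal to s.
        n_ge = n - bisect_left(sorted_calib, s)
        pvals.append((1 + n_ge) / (n + 1))
    return pvals


def benjamini_hochberg(pvals, alpha):
    """Return the sorted indices rejected by the BH step-up rule at level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k])


# Toy, hand-made scores (hypothetical): 99 null calibration scores in [0, 0.98]
# and four test scores, two of which are far above the null range.
calib_scores = [i / 100 for i in range(99)]
test_scores = [5.0, 4.0, 0.5, 0.3]

pvals = empirical_p_values(calib_scores, test_scores)
detections = benjamini_hochberg(pvals, alpha=0.1)
print(pvals)
print(detections)
```

With these toy inputs, only the two test points whose scores exceed every calibration score are flagged. The smallest attainable p-value is 1/(n+1), so the calibration sample must be large enough relative to the number of tests and the target level for any detection to be possible.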

Submission history

From: Ariane Marandon
[v1] Sat, 13 Aug 2022 17:14:55 GMT (5365kb,D)
[v2] Sat, 22 Oct 2022 08:35:12 GMT (5319kb,D)
