Title: Copula Quadrant Similarity for Anomaly Scores

Abstract: Practical anomaly detection requires applying numerous approaches due to the inherent difficulty of unsupervised learning. Direct comparison between complex or opaque anomaly detection algorithms is intractable; we instead propose a framework for associating the scores of multiple methods. Our aim is to answer the question: how should one measure the similarity between anomaly scores generated by different methods? The crux of scoring lies in the extremes, which identify the most anomalous observations. A pair of algorithms is defined here to be similar if they assign their highest scores to roughly the same small fraction of observations. To formalize this, we propose a measure based on extremal similarity in scoring distributions through a novel upper quadrant modeling approach, and contrast it with tail and other dependence measures. We illustrate our method with simulated and real-data experiments, applying spectral methods to cluster multiple anomaly detection methods and to contrast our similarity measure with others. We demonstrate that our method detects clusters of anomaly detection algorithms, which can then be used to build an accurate and robust ensemble algorithm.
Comments: 17 pages, 11 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2101.02330 [stat.ML]
  (or arXiv:2101.02330v1 [stat.ML] for this version)

Submission history

From: Matthew Davidow
[v1] Thu, 7 Jan 2021 02:19:36 GMT (5602kb,D)
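
The abstract defines two anomaly detection algorithms as similar when they assign their highest scores to roughly the same small fraction of observations. The sketch below illustrates that informal definition only: it is a simple top-fraction Jaccard overlap, not the copula-based upper quadrant measure the paper proposes. The function name `top_fraction_overlap` and the parameter `frac` are hypothetical, introduced here for illustration.

```python
import numpy as np


def top_fraction_overlap(scores_a, scores_b, frac=0.05):
    """Jaccard overlap of the observations each method ranks in its top `frac`.

    Illustrative stand-in for "similar in the extremes"; this is NOT the
    paper's copula quadrant similarity. Ties are broken arbitrarily by argsort.
    """
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    if a.ndim != 1 or a.shape != b.shape or a.size == 0:
        raise ValueError("scores must be non-empty 1-D arrays of equal length")
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must lie in (0, 1]")

    # Number of observations treated as "most anomalous" by each method.
    k = max(1, int(np.ceil(frac * a.size)))

    # Indices of the k largest scores under each method.
    top_a = set(np.argsort(a)[-k:].tolist())
    top_b = set(np.argsort(b)[-k:].tolist())

    # 1.0 means both methods flag the same set; 0.0 means disjoint sets.
    return len(top_a & top_b) / len(top_a | top_b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=1000)
    noisy = base + 0.1 * rng.normal(size=1000)   # a closely related scorer
    unrelated = rng.normal(size=1000)            # an independent scorer
    print(top_fraction_overlap(base, noisy))
    print(top_fraction_overlap(base, unrelated))
```

Because the overlap depends only on which observations land in each method's top set, it is unchanged by any strictly increasing rescaling of either score, which is a useful property when methods report scores on different scales.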