Title: Scikit-dimension: a Python package for intrinsic dimension estimation

Abstract: Dealing with uncertainty in applications of machine learning to real-life data critically depends on knowledge of the intrinsic dimensionality (ID). A number of methods have been suggested for estimating ID, but no standard package that makes it easy to apply them one by one or all at once has been implemented in Python. This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation. The scikit-dimension package provides a uniform implementation of most of the known ID estimators, based on the scikit-learn application programming interface, to evaluate global and local intrinsic dimension, as well as generators of synthetic toy and benchmark datasets that are widespread in the literature. The package is developed with tools for assessing code quality, coverage, unit testing and continuous integration. We briefly describe the package and demonstrate its use in a large-scale benchmarking (more than 500 datasets) of methods for ID estimation on real-life and synthetic data. The source code is available from this https URL, and the documentation is available from this https URL.
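
The scikit-learn-style interface mentioned in the abstract can be pictured with a small, self-contained sketch. This is not code from scikit-dimension: the class name TwoNNSketch, the helper make_plane and the dimension_ attribute are assumptions made for illustration, and the estimator is a simple maximum-likelihood variant of the two-nearest-neighbour idea, written in pure Python rather than taken from the package.

```python
import math
import random


class TwoNNSketch:
    """Toy global ID estimator with a scikit-learn-style fit() method.

    Hypothetical illustration only; not the scikit-dimension implementation.
    """

    def fit(self, X):
        log_ratios = []
        for i, p in enumerate(X):
            # Sorted distances from point i to every other point.
            dists = sorted(math.dist(p, q) for j, q in enumerate(X) if j != i)
            r1, r2 = dists[0], dists[1]
            if r1 > 0:  # skip exact duplicates to avoid division by zero
                log_ratios.append(math.log(r2 / r1))
        # The ratio r2/r1 is modelled as Pareto-distributed with shape d,
        # so the maximum-likelihood estimate is N / sum(log(r2/r1)).
        self.dimension_ = len(log_ratios) / sum(log_ratios)
        return self


def make_plane(n, ambient_dim, seed=0):
    """Sample n points from a 2-D unit square embedded in ambient_dim dimensions."""
    rng = random.Random(seed)
    return [[rng.random(), rng.random()] + [0.0] * (ambient_dim - 2)
            for _ in range(n)]


X = make_plane(400, 5)
estimator = TwoNNSketch().fit(X)
print(round(estimator.dimension_, 2))
```

On this toy plane the estimate should come out close to 2 even though the points live in five ambient dimensions; the exact value depends on the sample.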
Comments: 12 pages, 4 figures, 1 table
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
DOI: 10.3390/e23101368
Cite as: arXiv:2109.02596 [cs.LG]
  (or arXiv:2109.02596v1 [cs.LG] for this version)

Submission history

From: Dr. Andrei Zinovyev
[v1] Mon, 6 Sep 2021 16:46:38 GMT (7653kb,D)
