Title: PMLB v1.0: An open source dataset collection for benchmarking machine learning methods

Abstract: Motivation: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows.
Results: This release of PMLB provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community.
Availability: PMLB is available at this https URL. Python and R interfaces for PMLB can be installed through the Python Package Index and Comprehensive R Archive Network, respectively.
Comments: 4 pages, 1 figure. *: These authors contributed equally
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
ACM classes: H.2.8
Cite as: arXiv:2012.00058 [cs.LG]
  (or arXiv:2012.00058v3 [cs.LG] for this version)
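
The availability note above mentions a Python interface installable from the Python Package Index. As a minimal sketch of what a standardized fetch-and-inspect step might look like, the helper below assumes the `pmlb` package exposes `fetch_data(name, return_X_y=True)` returning a features/target pair. The helper name `summarize_dataset` and its injectable `fetch` parameter are hypothetical and not part of PMLB.

```python
from typing import Callable, Optional


def summarize_dataset(name: str, fetch: Optional[Callable] = None) -> dict:
    """Return basic shape statistics for one benchmark dataset.

    `fetch` is a hypothetical injection point: any callable with the
    signature fetch(name, return_X_y=True) -> (X, y). When omitted, the
    helper falls back to pmlb.fetch_data (assumed installed via
    `pip install pmlb`).
    """
    if fetch is None:
        # Imported lazily so the helper can be used without pmlb installed.
        from pmlb import fetch_data
        fetch = fetch_data

    # return_X_y=True asks for a (features, target) pair, not a single table.
    X, y = fetch(name, return_X_y=True)

    return {
        "name": name,
        "n_samples": len(X),
        # Guard against an empty dataset before indexing the first row.
        "n_features": len(X[0]) if len(X) else 0,
        "n_distinct_targets": len(set(y)),
    }
```

Calling `summarize_dataset("mushroom")` would use the default fetcher, where `"mushroom"` is assumed to be a valid dataset name and the first call needs network access to download the data. Passing a custom `fetch` keeps the helper usable offline or with locally cached data.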

Submission history

From: Trang Le
[v1] Mon, 30 Nov 2020 19:21:44 GMT (365kb)
[v2] Sun, 4 Apr 2021 20:31:09 GMT (1452kb)
[v3] Tue, 6 Apr 2021 12:37:35 GMT (1452kb)