Title: One-Shot Coresets: The Case of k-Clustering

Abstract: Scaling clustering algorithms to massive data sets is a challenging task. Recently, several successful approaches based on data summarization methods, such as coresets and sketches, were proposed. While these techniques provide provably good and small summaries, they are inherently problem-dependent: the practitioner has to commit to a fixed clustering objective before even exploring the data. However, can one construct small data summaries for a wide range of clustering problems simultaneously? In this work, we affirmatively answer this question by proposing an efficient algorithm that constructs such one-shot summaries for k-clustering problems while retaining strong theoretical guarantees.
Comments: To appear in AISTATS 2018
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:1711.09649 [stat.ML]
  (or arXiv:1711.09649v3 [stat.ML] for this version)

Submission history

From: Olivier Bachem
[v1] Mon, 27 Nov 2017 12:33:20 GMT (336kb,D)
[v2] Tue, 28 Nov 2017 15:31:27 GMT (336kb,D)
[v3] Tue, 20 Feb 2018 15:22:51 GMT (349kb,D)
