Statistics > Machine Learning
Title: One-Shot Coresets: The Case of k-Clustering
(Submitted on 27 Nov 2017 (v1), last revised 20 Feb 2018 (this version, v3))
Abstract: Scaling clustering algorithms to massive data sets is a challenging task. Recently, several successful approaches based on data summarization methods, such as coresets and sketches, have been proposed. While these techniques provide provably good and small summaries, they are inherently problem-dependent: the practitioner has to commit to a fixed clustering objective before even exploring the data. However, can one construct small data summaries for a wide range of clustering problems simultaneously? In this work, we affirmatively answer this question by proposing an efficient algorithm that constructs such one-shot summaries for k-clustering problems while retaining strong theoretical guarantees.
Submission history
From: Olivier Bachem
[v1] Mon, 27 Nov 2017 12:33:20 GMT (336kb,D)
[v2] Tue, 28 Nov 2017 15:31:27 GMT (336kb,D)
[v3] Tue, 20 Feb 2018 15:22:51 GMT (349kb,D)