Computer Science > Data Structures and Algorithms
Title: A New Coreset Framework for Clustering
(Submitted on 13 Apr 2021 (v1), last revised 29 Jul 2022 (this version, v4))
Abstract: Given a metric space, the $(k,z)$-clustering problem consists of finding $k$ centers such that the sum of the distances from every point to its closest center, each raised to the power $z$, is minimized. This encapsulates the famous $k$-median ($z=1$) and $k$-means ($z=2$) clustering problems. Designing small-space sketches of the data that approximately preserve the cost of the solutions, also known as \emph{coresets}, has been an important research direction over the last 15 years.
In this paper, we present a new, simple coreset framework that simultaneously improves upon the best known bounds for a large variety of settings, including Euclidean spaces, doubling metrics, minor-free metrics, and general metrics.
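To make the $(k,z)$-clustering objective from the abstract concrete, the following is a minimal Python sketch of the cost function for points in Euclidean space. It is not taken from the paper: the function name `kz_cost`, the optional `weights` parameter, and the toy data are all assumptions chosen for illustration. The weights reflect the common convention of representing a coreset as a weighted point set, which the abstract does not spell out.

```python
import math


def kz_cost(points, centers, z, weights=None):
    """Return the sum over points of weight * (distance to nearest center) ** z.

    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    if weights is None:
        weights = [1.0] * len(points)  # unweighted input: every point counts once
    total = 0.0
    for point, weight in zip(points, weights):
        nearest = min(math.dist(point, center) for center in centers)
        total += weight * nearest ** z
    return total


# Toy data (assumed): two pairs of points, one center between each pair.
points = [(0.0, 0.0), (0.0, 4.0), (10.0, 0.0), (10.0, 4.0)]
centers = [(0.0, 2.0), (10.0, 2.0)]

k_median_cost = kz_cost(points, centers, z=1)  # each point is at distance 2
k_means_cost = kz_cost(points, centers, z=2)   # each point contributes 2 ** 2

# A weighted subset evaluated on the same centers. Matching the cost for this
# one solution does not make it a coreset: a coreset must approximately
# preserve the cost for every candidate set of k centers.
subset = [(0.0, 0.0), (10.0, 4.0)]
subset_cost = kz_cost(subset, centers, z=1, weights=[2.0, 2.0])
```

Setting `z=1` gives the $k$-median cost and `z=2` the $k$-means cost for the given centers. The sketch only evaluates a fixed solution; it does not search for the optimal centers or construct a coreset.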
Submission history
From: David Saulpic
[v1] Tue, 13 Apr 2021 12:15:36 GMT (171kb,D)
[v2] Tue, 18 May 2021 17:43:56 GMT (211kb,D)
[v3] Sun, 5 Dec 2021 14:45:46 GMT (229kb,D)
[v4] Fri, 29 Jul 2022 14:46:20 GMT (185kb,D)