Title: Practical Coreset Constructions for Machine Learning
(Submitted on 19 Mar 2017 (v1), last revised 4 Jun 2017 (this version, v2))
Abstract: We investigate coresets, succinct small summaries of large data sets, constructed so that solutions found on the summary are provably competitive with solutions found on the full data set. We provide an overview of the state of the art in coreset construction for machine learning. In Section 2, we present both the intuition behind and a theoretically sound framework to construct coresets for general problems and apply it to $k$-means clustering. In Section 3, we summarize existing coreset construction algorithms for a variety of machine learning problems such as maximum likelihood estimation of mixture models, Bayesian non-parametric models, principal component analysis, regression and general empirical risk minimization.
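To make the idea of a coreset for $k$-means concrete, here is a minimal sketch of one simple importance-sampling scheme: each point is drawn with a probability that mixes uniform sampling with sampling proportional to its squared distance from the data mean, and each draw is weighted by the inverse of its sampling probability. This is an illustration under stated assumptions, not necessarily the construction presented in the paper; the function names `sample_coreset` and `kmeans_cost` and the 50/50 mixing ratio are choices made for this example.

```python
import random


def sample_coreset(points, m, seed=0):
    """Draw m weighted points from `points` (a list of equal-length tuples).

    Hypothetical helper for illustration: the sampling distribution mixes
    uniform sampling with sampling proportional to squared distance to the mean.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Mean of the full data set.
    mean = [sum(p[j] for p in points) / n for j in range(dim)]

    # Squared Euclidean distance of every point to the mean.
    sq_dist = [sum((p[j] - mean[j]) ** 2 for j in range(dim)) for p in points]
    total = sum(sq_dist)

    if total > 0:
        probs = [0.5 / n + 0.5 * d / total for d in sq_dist]
    else:
        # All points coincide, so fall back to uniform sampling.
        probs = [1.0 / n] * n

    idx = rng.choices(range(n), weights=probs, k=m)

    # Weight 1 / (m * q(x)) makes the weighted cost an unbiased estimate
    # of the cost on the full data set for any fixed set of centers.
    return [(points[i], 1.0 / (m * probs[i])) for i in idx]


def kmeans_cost(weighted_points, centers):
    """Weighted k-means cost: sum of w * squared distance to the nearest center."""
    cost = 0.0
    for p, w in weighted_points:
        nearest = min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        cost += w * nearest
    return cost
```

With such a summary, one would evaluate or optimize `kmeans_cost` on the weighted sample instead of the full data. The quality of the approximation depends on the sample size `m` and on the data; this sketch makes no claim about the guarantees derived in the paper.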
Submission history
From: Mario Lucic
[v1] Sun, 19 Mar 2017 17:45:29 GMT (490kb,D)
[v2] Sun, 4 Jun 2017 22:40:16 GMT (508kb,D)