Title: Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions

Abstract: We address general-shaped clustering problems under very weak parametric assumptions with a two-step hybrid robust clustering algorithm based on trimmed k-means and hierarchical agglomeration. The algorithm has low computational complexity and effectively identifies the clusters even in the presence of data contamination. We also present natural generalizations of the approach, as well as an adaptive procedure to estimate the amount of contamination in a data-driven fashion. Our proposal outperforms state-of-the-art robust, model-based methods in our numerical simulations and in real-world applications related to color quantization for image analysis, human mobility patterns based on GPS data, biomedical images of diabetic retinopathy, and functional data across weather stations.
Comments: 19 pages, 16 figures
Subjects: Methodology (stat.ME); Applications (stat.AP); Machine Learning (stat.ML)
Cite as: arXiv:2201.06391 [stat.ME]
  (or arXiv:2201.06391v1 [stat.ME] for this version)
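
The two-step idea named in the abstract (trimmed k-means followed by hierarchical agglomeration) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the deterministic initialization, the fixed trimming fraction `alpha`, and the single-linkage merge of cluster centers are all assumptions made for illustration, and the paper's actual merging criterion and adaptive contamination estimate may differ.

```python
import math


def trimmed_kmeans(points, k, alpha, n_iter=50):
    """Lloyd-style trimmed k-means (illustrative sketch).

    In each iteration the alpha fraction of points farthest from their
    nearest center is left out of the center update. Trimmed points get
    label -1.
    """
    n = len(points)
    # Assumption: deterministic, evenly spaced initialization for
    # reproducibility; k-means is more commonly run with random restarts.
    centers = [tuple(points[i * n // k]) for i in range(k)]
    n_keep = n - int(alpha * n)
    labels = [-1] * n
    for _ in range(n_iter):
        # (distance to nearest center, point index, center index)
        nearest = []
        for i, p in enumerate(points):
            d, j = min((math.dist(p, c), j) for j, c in enumerate(centers))
            nearest.append((d, i, j))
        nearest.sort()
        labels = [-1] * n
        for _, i, j in nearest[:n_keep]:
            labels[i] = j
        # Recompute each center from its untrimmed members only.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def merge_centers(centers, n_groups):
    """Single-linkage agglomeration of the k centers into n_groups groups.

    Assumption: single linkage on center-to-center Euclidean distance is
    used here only as a simple stand-in for the merging step.
    """
    groups = [[j] for j in range(len(centers))]
    while len(groups) > n_groups:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = min(
                    math.dist(centers[i], centers[j])
                    for i in groups[a]
                    for j in groups[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] += groups.pop(b)  # b > a, so index a stays valid
    return {j: g for g, members in enumerate(groups) for j in members}


def tk_merge_sketch(points, k, n_groups, alpha):
    """Over-segment with trimmed k-means, then merge centers into groups."""
    centers, labels = trimmed_kmeans(points, k, alpha)
    group_of = merge_centers(centers, n_groups)
    return [group_of[lab] if lab >= 0 else -1 for lab in labels]
```

Calling `tk_merge_sketch(points, k, n_groups, alpha)` with `k` larger than `n_groups` first over-segments the data into `k` small clusters and then merges them, which is what lets a k-means-based first step recover non-spherical groups. Points flagged as contamination keep the label -1.
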

Submission history

From: Luca Insolia
[v1] Mon, 17 Jan 2022 13:05:05 GMT (3650kb)
