Determinantal consensus clustering

Vicente, Serge; Murua, Alejandro

(Submitted on 7 Feb 2021)

Abstract: Random restarts of a given algorithm produce many partitions, which are then combined to yield a consensus clustering. Ensemble methods such as consensus clustering have been recognized as more robust approaches to data clustering than single clustering algorithms. We propose the use of determinantal point processes (DPPs) for the random restart of clustering algorithms based on initial sets of center points, such as k-medoids or k-means. The relation between DPPs and kernel-based methods makes DPPs suitable for describing and quantifying similarity between objects. DPPs favor diversity of the center points within subsets, so subsets with more similar points are less likely to be generated than subsets with very distinct points. The current and most popular sampling technique is to sample center points uniformly at random. We show through extensive simulations that, unlike DPP sampling, this technique fails both to ensure diversity and to obtain good coverage of all data facets. These two properties are key to the good performance of DPPs with small ensembles. Simulations with artificial datasets and applications to real datasets show that determinantal consensus clustering outperforms classical algorithms such as k-medoids and k-means consensus clustering, which are based on uniform random sampling of center points.
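
The diversity property described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify their kernel or sampling procedure, so the Gaussian kernel, the bandwidth, the toy data, and the helper names (gaussian_kernel, k_dpp_subset_probs, sample_centers) are all assumptions made for illustration. The sketch draws a fixed-size subset S with probability proportional to det(L_S), where L is the kernel matrix, by brute-force enumeration of all size-k subsets. That is exact, but feasible only for very small datasets.

```python
import itertools
import numpy as np

def gaussian_kernel(X, bandwidth=1.0):
    # Assumed similarity: RBF kernel on squared Euclidean distances.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def k_dpp_subset_probs(L, k):
    # Enumerate every size-k subset S and weight it by det(L_S).
    # Brute force: only practical for tiny n.
    subsets = list(itertools.combinations(range(L.shape[0]), k))
    weights = np.array([np.linalg.det(L[np.ix_(s, s)]) for s in subsets])
    weights = np.clip(weights, 0.0, None)  # guard against negative round-off
    return subsets, weights / weights.sum()

def sample_centers(L, k, rng):
    # Draw one set of k center indices from the enumerated distribution.
    subsets, probs = k_dpp_subset_probs(L, k)
    return subsets[rng.choice(len(subsets), p=probs)]

# Toy data (hypothetical): two tight groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
group = np.array([0, 0, 0, 1, 1, 1])

L = gaussian_kernel(X, bandwidth=1.0)
subsets, probs = k_dpp_subset_probs(L, k=2)
prob_of = dict(zip(subsets, probs))

# Probability that the two centers fall in different groups.
# Uniform sampling of pairs gives 9/15 = 0.6 for this layout.
cross_mass = sum(p for s, p in zip(subsets, probs)
                 if group[s[0]] != group[s[1]])

rng = np.random.default_rng(0)
centers = sample_centers(L, 2, rng)
print("cross-group probability:", cross_mass)
print("sampled center indices:", centers)
```

In this toy layout, a pair of nearly identical points yields a kernel submatrix with a determinant close to zero, so such pairs are rarely proposed as initial centers, whereas uniform sampling treats all pairs alike. The sampled indices could then seed a center-based algorithm such as k-medoids or k-means on each restart.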

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2102.03948 [stat.ML]
	(or arXiv:2102.03948v1 [stat.ML] for this version)

Submission history

From: Serge Vicente Vicente
[v1] Sun, 7 Feb 2021 23:48:24 GMT
