Fuzzy clustering of distribution-valued data using adaptive L2 Wasserstein distances

Irpino, Antonio; De Carvalho, Francisco; Verde, Rosanna

(Submitted on 2 May 2016)

Abstract: Distributional (or distribution-valued) data are a new type of data arising from several sources and are regarded as realizations of distributional variables. A new set of fuzzy c-means algorithms for data described by distributional variables is proposed.
The algorithms use the $L2$ Wasserstein distance between distributions as the dissimilarity measure. Besides extending the fuzzy c-means algorithm to distributional data, and building on a decomposition of the squared $L2$ Wasserstein distance, we propose a set of algorithms that use different automatic ways to compute the weights associated with the variables as well as with their components, either globally or cluster-wise. The relevance weights are computed during the clustering process under product-to-one constraints.
The relevance weights induce adaptive distances that express the importance of each variable or of each component in the clustering process, and thus also act as a variable selection method. We have tested the proposed algorithms on artificial and real-world data. The results confirm that the proposed methods capture the cluster structure of the data better than the standard fuzzy c-means with non-adaptive distances.
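
To make two ideas from the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes univariate distributions represented by quantile functions sampled on a common grid, and it assumes the decomposition meant is the commonly used split of the squared $L2$ Wasserstein distance into a squared difference of means plus a squared distance between centered quantile functions; the abstract does not spell out which decomposition is used. The function names (`quantiles`, `w2_squared_components`, `product_to_one_weights`) are hypothetical. The weight formula is the generic closed-form minimizer of a weighted sum under a product-to-one constraint, which may differ in detail from the paper's update rules.

```python
import numpy as np


def quantiles(sample, n_levels=1000):
    """Approximate the quantile function of an empirical sample on a midpoint grid."""
    t = (np.arange(n_levels) + 0.5) / n_levels
    return np.quantile(np.asarray(sample, dtype=float), t)


def w2_squared_components(q_f, q_g):
    """Split the squared L2 Wasserstein distance into two parts.

    q_f and q_g are quantile functions sampled on the same grid of levels.
    Returns (mean_part, dispersion_part); their sum equals mean((q_f - q_g)**2),
    the grid approximation of the squared distance.
    """
    q_f = np.asarray(q_f, dtype=float)
    q_g = np.asarray(q_g, dtype=float)
    mean_part = (q_f.mean() - q_g.mean()) ** 2
    centered_diff = (q_f - q_f.mean()) - (q_g - q_g.mean())
    dispersion_part = np.mean(centered_diff ** 2)
    return mean_part, dispersion_part


def product_to_one_weights(dispersions):
    """Weights w_j minimizing sum_j w_j * D_j subject to prod_j w_j = 1.

    Assumes every D_j > 0. The Lagrangian solution is w_j = geometric_mean(D) / D_j,
    so components with small within-cluster dispersion receive large weights.
    """
    d = np.asarray(dispersions, dtype=float)
    geo_mean = np.exp(np.mean(np.log(d)))
    return geo_mean / d


# Toy usage: two distributions given directly by four quantile values each.
q_a = np.array([0.0, 1.0, 2.0, 3.0])
q_b = np.array([1.0, 3.0, 5.0, 7.0])
mean_part, dispersion_part = w2_squared_components(q_a, q_b)
total = np.mean((q_a - q_b) ** 2)

# Toy usage: relevance weights for two components with dispersions 1 and 4.
weights = product_to_one_weights([1.0, 4.0])
```

In this toy example the two parts add up to the full squared distance, and the weights multiply to one. The component with the smaller dispersion gets the larger weight, which is the mechanism by which adaptive distances emphasize the more discriminating variables or components.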

Subjects:	Machine Learning (stat.ML)
MSC classes:	62A86, 62H30, 62G30
ACM classes:	G.3; I.5.1; H.3.3
Cite as:	arXiv:1605.00513 [stat.ML]
	(or arXiv:1605.00513v1 [stat.ML] for this version)

Submission history

From: Antonio Irpino PhD
[v1] Mon, 2 May 2016 14:56:18 GMT (579kb,D)
