A Random Finite Set Model for Data Clustering

Phung, Dinh; Vo, Ba-Ngu


(Submitted on 14 Mar 2017)

Abstract: The goal of data clustering is to partition data points into groups to minimize a given objective function. While most existing clustering algorithms treat each data point as a vector, in many applications each datum is not a vector but a point pattern, that is, a set of points. Moreover, many existing clustering methods require the user to specify the number of clusters, which is often not known in advance. This paper proposes a new class of models for data clustering that addresses set-valued data as well as an unknown number of clusters, using a Dirichlet Process mixture of Poisson random finite sets. We also develop an efficient Markov Chain Monte Carlo posterior inference technique that can learn the number of clusters and the mixture parameters automatically from the data. Numerical studies are presented to demonstrate the salient features of this new model, in particular its capacity to discover extremely unbalanced clusters in data.
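
The generative side of such a model can be illustrated with a minimal sketch. It is not the paper's implementation and does not cover the MCMC inference step. It assumes a Chinese restaurant process as the Dirichlet Process representation, one-dimensional Gaussian feature densities with unit variance, and arbitrary Gamma and Normal priors on each cluster's rate and mean. All function names and hyperparameters below are hypothetical.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_dp_poisson_rfs(num_sets, alpha, rng):
    """Sample set-valued data from a CRP mixture of Poisson random finite sets.

    Assumed priors (illustrative only): rate ~ Gamma(shape=2, scale=2),
    mean ~ Normal(0, 5); points in a set are i.i.d. Normal(mean, 1).
    """
    clusters = []  # (rate, mean) per cluster
    counts = []    # number of sets assigned to each cluster
    labels, data = [], []
    for _ in range(num_sets):
        # CRP: join cluster k with weight counts[k], open a new one with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(clusters):
            clusters.append((rng.gammavariate(2.0, 2.0), rng.gauss(0.0, 5.0)))
            counts.append(0)
        counts[k] += 1
        rate, mean = clusters[k]
        # Poisson RFS: Poisson-distributed cardinality, then i.i.d. points.
        cardinality = sample_poisson(rate, rng)
        data.append([rng.gauss(mean, 1.0) for _ in range(cardinality)])
        labels.append(k)
    return labels, data, clusters


def poisson_rfs_log_density(points, rate, mean):
    """Log of exp(-rate) * prod_x rate * N(x; mean, 1) for a finite set of points."""
    log_norm = -0.5 * math.log(2.0 * math.pi)
    return -rate + sum(
        math.log(rate) + log_norm - 0.5 * (x - mean) ** 2 for x in points
    )
```

Each datum here is a list whose length varies from one draw to the next, which is the feature that separates set-valued clustering from clustering fixed-length vectors. The number of clusters is not fixed in advance either: it grows as the process opens new clusters.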

Comments:	In Proceedings of International Conference on Fusion (FUSION), Salamanca, Spain, July 2014
Subjects:	Machine Learning (stat.ML)
Cite as:	arXiv:1703.04832 [stat.ML]
	(or arXiv:1703.04832v1 [stat.ML] for this version)

Submission history

From: Dinh Phung
[v1] Tue, 14 Mar 2017 23:35:57 GMT (3517kb)
