We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

stat

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Machine Learning

Title: Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity

Abstract: The development of algorithms for unsupervised pattern recognition by nonlinear clustering is a notable problem in data science. Markov clustering (MCL) is a renowned algorithm that simulates stochastic flows on a network of sample similarities to detect the structural organization of clusters in the data, but it has never been generalized to deal with data nonlinearity. Minimum Curvilinearity (MC) is a principle that approximates nonlinear sample distances in the high-dimensional feature space by curvilinear distances, which are computed as transversal paths over their minimum spanning tree, and then stored in a kernel. Here we propose MC-MCL, which is the first nonlinear kernel extension of MCL and exploits Minimum Curvilinearity to enhance the performance of MCL in real and synthetic data with underlying nonlinear patterns. MC-MCL is compared with baseline clustering methods, including DBSCAN, K-means and affinity propagation. We find that Minimum Curvilinearity provides a valuable framework to estimate nonlinear distances also when its kernel is applied in combination with MCL. Indeed, MC-MCL overcomes classical MCL and even baseline clustering algorithms in different nonlinear datasets.
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1912.12211 [cs.LG]
  (or arXiv:1912.12211v1 [cs.LG] for this version)

Submission history

From: Carlo Vittorio Cannistraci [view email]
[v1] Fri, 27 Dec 2019 16:07:23 GMT (686kb)

Link back to: arXiv, form interface, contact.