Computer Science > Machine Learning
Title: Fast Randomized Semi-Supervised Clustering
(Submitted on 20 May 2016 (v1), last revised 9 Oct 2016 (this version, v3))
Abstract: We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking operator and study its performance on a simple model. For the case of two clusters, we give bounds on the classification error and show that a small error can be achieved from $O(n)$ randomly chosen measurements, where $n$ is the number of items in the dataset. Our algorithm is therefore efficient in terms of both time and space complexity. We also investigate numerically the performance of the algorithm on synthetic and real-world data.
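The abstract does not spell out the algorithm, so the following is only a rough sketch of what a power iteration of a non-backtracking operator can look like for two clusters, not the authors' implementation. Everything specific here is an assumption made for illustration: labels are coded as +1/-1, each comparison reports "same cluster" (+1) or "different cluster" (-1) and is flipped with probability `flip_prob`, messages leaving unlabeled items start at zero, and the names `sample_comparisons` and `nonbacktracking_cluster` are hypothetical.

```python
import random
from collections import defaultdict


def sample_comparisons(labels, num_pairs, flip_prob, rng):
    """Draw distinct random pairs and a noisy comparison for each.

    Assumed measurement model: J = +1 if the two items share a label,
    -1 otherwise, flipped with probability flip_prob.
    """
    n = len(labels)
    comparisons = {}
    while len(comparisons) < num_pairs:
        i, j = rng.sample(range(n), 2)
        key = (min(i, j), max(i, j))
        if key in comparisons:
            continue
        J = labels[i] * labels[j]
        if rng.random() < flip_prob:
            J = -J
        comparisons[key] = J
    return comparisons


def nonbacktracking_cluster(n, comparisons, known, num_iters=10):
    """Estimate +1/-1 labels by iterating messages on directed edges.

    known maps item index -> revealed label (the partially labeled part).
    One iteration applies the signed non-backtracking operator:
        v[i->j] = sum over neighbors k of i, k != j, of J_ki * v[k->i]
    """
    neighbors = defaultdict(list)  # i -> list of (k, J_ik)
    for (i, j), J in comparisons.items():
        neighbors[i].append((j, J))
        neighbors[j].append((i, J))

    # Assumed initialization: labeled items send their label, others send 0.
    msg = {(i, j): float(known.get(i, 0))
           for i in range(n) for j, _ in neighbors[i]}

    for _ in range(num_iters):
        new = {}
        for i in range(n):
            total = sum(J * msg[(k, i)] for k, J in neighbors[i])
            for j, J_ij in neighbors[i]:
                # Subtract j's own contribution: no backtracking along i-j.
                new[(i, j)] = total - J_ij * msg[(j, i)]
        # Rescale to keep the values bounded; signs are unaffected.
        norm = max((abs(v) for v in new.values()), default=1.0) or 1.0
        msg = {edge: v / norm for edge, v in new.items()}

    estimates = []
    for i in range(n):
        if i in known:
            estimates.append(known[i])
            continue
        total = sum(J * msg[(k, i)] for k, J in neighbors[i])
        estimates.append(1 if total >= 0 else -1)
    return estimates


def demo(seed=0, n=400, pairs_per_item=10, flip_prob=0.1, frac_known=0.1):
    """Run the sketch on synthetic data; return accuracy on unlabeled items."""
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in range(n)]
    known = {i: labels[i] for i in rng.sample(range(n), int(frac_known * n))}
    comparisons = sample_comparisons(labels, pairs_per_item * n, flip_prob, rng)
    estimates = nonbacktracking_cluster(n, comparisons, known)
    unlabeled = [i for i in range(n) if i not in known]
    return sum(estimates[i] == labels[i] for i in unlabeled) / len(unlabeled)
```

Calling `demo()` draws a number of comparisons proportional to `n`, in the spirit of the $O(n)$ measurements mentioned in the abstract. Each iteration touches every directed edge a constant number of times, so the cost per iteration in this sketch is linear in the number of comparisons.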
Submission history
From: Alaa Saade
[v1] Fri, 20 May 2016 16:21:13 GMT (376kb,D)
[v2] Mon, 23 May 2016 16:26:16 GMT (376kb,D)
[v3] Sun, 9 Oct 2016 07:45:16 GMT (920kb,D)