Title: Fast Randomized Semi-Supervised Clustering

Abstract: We consider the problem of clustering partially labeled data from a minimal number of randomly chosen pairwise comparisons between the items. We introduce an efficient local algorithm based on a power iteration of the non-backtracking operator and study its performance on a simple model. For the case of two clusters, we give bounds on the classification error and show that a small error can be achieved from $O(n)$ randomly chosen measurements, where $n$ is the number of items in the dataset. Our algorithm is therefore efficient in terms of both time and space complexity. We also investigate numerically the performance of the algorithm on synthetic and real-world data.
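
To make the idea in the abstract concrete, here is a minimal Python sketch of a power iteration on a non-backtracking operator for two clusters. It is an illustration under stated assumptions, not the authors' implementation: the function names (`sample_measurements`, `nonbacktracking_labels`), the measurement model (each comparison returns +1 for "same cluster" and -1 for "different", flipped with probability `flip_prob`), the choice to initialize messages from unlabeled items to zero, and the fixed iteration count are all hypothetical choices made for this example.

```python
import numpy as np

def sample_measurements(labels, n_pairs, flip_prob, rng):
    """Draw random pairs (i, j) with i != j and observe J_ij = s_i * s_j,
    flipped with probability flip_prob (assumed noise model)."""
    n = len(labels)
    i = rng.integers(0, n, size=n_pairs)
    j = (i + rng.integers(1, n, size=n_pairs)) % n  # offset in 1..n-1, so j != i
    flips = np.where(rng.random(n_pairs) < flip_prob, -1, 1)
    return i, j, labels[i] * labels[j] * flips

def nonbacktracking_labels(n, i, j, J, known, n_iter=20):
    """Estimate +/-1 labels by iterating messages on directed edges.
    known[k] is the revealed label of item k, or 0 if unlabeled."""
    m = len(i)
    src = np.concatenate([i, j])            # each pair gives two directed edges
    dst = np.concatenate([j, i])
    w = np.concatenate([J, J]).astype(float)
    rev = (np.arange(2 * m) + m) % (2 * m)  # index of the reversed edge
    v = known[src].astype(float)            # assumption: unlabeled items start at 0
    for _ in range(n_iter):
        # Weighted sum of all messages arriving at each node.
        incoming = np.bincount(dst, weights=w * v, minlength=n)
        # Message a->b: everything arriving at a, except what came back from b.
        v = incoming[src] - w[rev] * v[rev]
        norm = np.linalg.norm(v)
        if norm > 0:
            v /= norm                       # keep the iteration numerically stable
    score = np.bincount(dst, weights=w * v, minlength=n)
    return np.sign(score).astype(int)       # 0 means "no information reached this item"

rng = np.random.default_rng(0)
n = 500
labels = rng.choice([-1, 1], size=n)
known = np.where(rng.random(n) < 0.1, labels, 0)   # reveal roughly 10% of labels
i, j, J = sample_measurements(labels, 5 * n, 0.1, rng)
estimate = nonbacktracking_labels(n, i, j, J, known)
print("fraction correctly labeled:", np.mean(estimate == labels))
```

Each iteration touches every directed edge a constant number of times, so the cost per step is linear in the number of measurements; with $O(n)$ measurements this matches the linear time and memory footprint the abstract refers to.
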
Subjects: Machine Learning (cs.LG); Probability (math.PR); Statistics Theory (math.ST); Machine Learning (stat.ML)
Journal reference: Journal of Physics: Conf. Series 1036 (2018) 012015
DOI: 10.1088/1742-6596/1036/1/012015
Cite as: arXiv:1605.06422 [cs.LG]
  (or arXiv:1605.06422v3 [cs.LG] for this version)

Submission history

From: Alaa Saade
[v1] Fri, 20 May 2016 16:21:13 GMT (376kb,D)
[v2] Mon, 23 May 2016 16:26:16 GMT (376kb,D)
[v3] Sun, 9 Oct 2016 07:45:16 GMT (920kb,D)
