Title: Semi-Supervised Clustering of Sparse Graphs: Crossing the Information-Theoretic Threshold

Abstract: The stochastic block model is a canonical random graph model for clustering and community detection on network-structured data. Decades of extensive study of the problem have established many profound results, among which the phase transition at the Kesten-Stigum threshold is particularly interesting from both a mathematical and an applied standpoint. It states that no estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold. Nevertheless, if we slightly extend the horizon to the ubiquitous semi-supervised setting, this fundamental limitation disappears completely. We prove that with an arbitrary fraction of the labels revealed, the detection problem is feasible throughout the parameter domain. Moreover, we introduce two efficient algorithms, one combinatorial and one based on optimization, to integrate label information with graph structures. Our work brings a new perspective to stochastic models of networks and to semidefinite programming research.
Comments: 40 pages, 8 figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC); Probability (math.PR)
MSC classes: 60-08 (Primary) 90C35 (Secondary) 90C22
ACM classes: G.3; I.2.6
Cite as: arXiv:2205.11677 [stat.ML]
  (or arXiv:2205.11677v1 [stat.ML] for this version)

Submission history

From: Junda Sheng
[v1] Tue, 24 May 2022 00:03:25 GMT (1901kb,D)
