Title: Exact recovery and sharp thresholds of Stochastic Ising Block Model

Authors: Min Ye
Abstract: The stochastic block model (SBM) is a random graph model in which the edges are generated according to the underlying cluster structure on the vertices. The (ferromagnetic) Ising model, on the other hand, assigns $\pm 1$ labels to vertices according to an underlying graph structure in such a way that two vertices connected in the graph are more likely to be assigned the same label. In the SBM, one aims to recover the underlying clusters from the graph structure, while in the Ising model, an extensively studied problem is to recover the underlying graph structure from i.i.d. samples (labelings of the vertices).
In this paper, we propose a natural composition of the SBM and the Ising model, which we call the Stochastic Ising Block Model (SIBM). In SIBM, we take the SBM in its simplest form, where $n$ vertices are divided into two equal-sized clusters and each pair of vertices is connected independently with probability $p$ within clusters and $q$ across clusters. Then we use the graph $G$ generated by the SBM as the underlying graph of the Ising model and draw $m$ i.i.d. samples from it. The objective is to exactly recover the two clusters in the SBM from the samples generated by the Ising model, without observing the graph $G$. As the main result of this paper, we establish a sharp threshold $m^\ast$ on the sample complexity of this exact recovery problem in a properly chosen regime, where $m^\ast$ can be calculated from the parameters of SIBM. We show that when $m\ge m^\ast$, one can recover the clusters from $m$ samples in $O(n)$ time as the number of vertices $n$ goes to infinity. When $m<m^\ast$, we further show that for almost all choices of parameters of SIBM, the success probability of any recovery algorithm approaches $0$ as $n\to\infty$.
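The two-stage generative process described in the abstract can be illustrated with a minimal Python sketch. It makes several assumptions that the abstract does not state: the Ising model is parameterized by a single hypothetical inverse temperature `beta` acting only on the edges of $G$, and samples are drawn approximately with a Gibbs sampler run for a fixed number of sweeps rather than exactly. The function names are illustrative, and the sketch covers only data generation, not the paper's recovery algorithm or its parameter regime.

```python
import math
import random


def sample_sbm(n, p, q, rng):
    """Two-cluster SBM: edge probability p within clusters, q across."""
    # Assumes n is even so the two clusters have equal size.
    labels = [1] * (n // 2) + [-1] * (n - n // 2)
    rng.shuffle(labels)
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return labels, neighbors


def gibbs_ising_sample(neighbors, beta, sweeps, rng):
    """Approximate draw from a ferromagnetic Ising model on the graph."""
    n = len(neighbors)
    sigma = [rng.choice((-1, 1)) for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            field = beta * sum(sigma[j] for j in neighbors[i])
            # P(sigma_i = +1 | others) = (1 + tanh(field)) / 2
            p_plus = 0.5 * (1.0 + math.tanh(field))
            sigma[i] = 1 if rng.random() < p_plus else -1
    return sigma


def draw_sibm_samples(n, p, q, beta, m, sweeps=50, seed=0):
    """Generate G from the SBM, then m labelings from the Ising model on G."""
    rng = random.Random(seed)
    labels, neighbors = sample_sbm(n, p, q, rng)
    # Each sample starts a fresh chain on the same hidden graph G.
    samples = [gibbs_ising_sample(neighbors, beta, sweeps, rng) for _ in range(m)]
    return labels, samples
```

A call such as `draw_sibm_samples(20, 0.8, 0.1, 0.3, 3)` returns the hidden cluster labels together with three $\pm 1$ labelings; in the recovery problem only the labelings would be observed, not the graph or the cluster labels.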
Comments: Fixed a gap in the original proof of Theorem 5. The new proof of Theorem 5 relies on Lemma 5, which is the main new element in this version
Subjects: Probability (math.PR); Information Theory (cs.IT); Machine Learning (stat.ML)
Cite as: arXiv:2004.05944 [math.PR]
  (or arXiv:2004.05944v3 [math.PR] for this version)

Submission history

From: Min Ye
[v1] Mon, 13 Apr 2020 13:59:13 GMT (42kb)
[v2] Mon, 13 Jul 2020 02:52:37 GMT (42kb)
[v3] Wed, 14 Oct 2020 07:21:59 GMT (43kb)
