Title: Revisiting Spectral Graph Clustering with Generative Community Models

Abstract: The methodology of community detection can be divided into two principles: imposing a network model on a given graph, or optimizing a designed objective function. The former provides guarantees on theoretical detectability but falls short when the graph is inconsistent with the underlying model. The latter is model-free but fails to provide quality assurance for the detected communities. In this paper, we propose a novel unified framework to combine the advantages of these two principles. The presented method, SGC-GEN, not only considers the detection error caused by the corresponding model mismatch to a given graph, but also yields a theoretical guarantee on community detectability by analyzing Spectral Graph Clustering (SGC) under GENerative community models (GCMs). SGC-GEN incorporates the predictability on correct community detection with a measure of community fitness to GCMs. It resembles the formulation of supervised learning problems by enabling various community detection loss functions and model mismatch metrics. We further establish a theoretical condition for correct community detection using the normalized graph Laplacian matrix under a GCM, which provides a novel data-driven loss function for SGC-GEN. In addition, we present an effective algorithm to implement SGC-GEN, and show that the computational complexity of SGC-GEN is comparable to the baseline methods. Our experiments on 18 real-world datasets demonstrate that SGC-GEN possesses superior and robust performance compared to 6 baseline methods under 7 representative clustering metrics.
Comments: Accepted by IEEE International Conference on Data Mining (ICDM) 2017 as a regular paper - full paper with supplementary material
Subjects: Machine Learning (stat.ML); Social and Information Networks (cs.SI)
Cite as: arXiv:1709.04594 [stat.ML]
  (or arXiv:1709.04594v2 [stat.ML] for this version)

Submission history

From: Pin-Yu Chen
[v1] Thu, 14 Sep 2017 02:34:30 GMT (467kb)
[v2] Thu, 5 Oct 2017 04:03:14 GMT (467kb)
