Title: Tensor Clustering with Planted Structures: Statistical Optimality and Computational Limits

Abstract: This paper studies the statistical and computational limits of high-order clustering with planted structures. We focus on two clustering models, constant high-order clustering (CHC) and rank-one higher-order clustering (ROHC), and study the methods and theories for testing whether a cluster exists (detection) and identifying the support of the cluster (recovery).
Specifically, we identify the sharp boundaries of the signal-to-noise ratio for which CHC and ROHC detection/recovery are statistically possible. We also develop tight computational thresholds: when the signal-to-noise ratio is below these thresholds, we prove that polynomial-time algorithms cannot solve these problems under the computational hardness conjectures of hypergraphic planted clique (HPC) detection and hypergraphic planted dense subgraph (HPDS) recovery. We also propose polynomial-time tensor algorithms that achieve reliable detection and recovery when the signal-to-noise ratio is above these thresholds. Both sparsity and tensor structures yield the computational barriers in high-order tensor clustering. The interplay between them results in significant differences between high-order tensor clustering and matrix clustering in the literature in terms of statistical and computational phase transition diagrams, algorithmic approaches, hardness conjectures, and proof techniques. To the best of our knowledge, we are the first to give a thorough characterization of the statistical and computational trade-off for such a double computational-barrier problem. In addition, we provide evidence for the computational hardness conjectures of HPC detection and HPDS recovery.
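
To make the detection/recovery distinction concrete, here is a minimal toy sketch of a planted constant cluster in an order-3 tensor. It is not the paper's algorithms: the Gaussian noise, the constant mean shift `lam`, the sum-based test, and the marginal-sum recovery rule are all assumptions chosen for illustration, and the function names `planted_tensor`, `detect`, and `recover` are hypothetical. The sketch is only expected to behave well when the signal is strong.

```python
import random

def planted_tensor(n, k, lam, rng, planted=True):
    """Return an n x n x n tensor of N(0, 1) noise and three index sets.

    If planted is True, lam is added on the k x k x k block indexed by
    the three sets (an assumed, simplified constant-cluster model).
    The index sets are sampled either way; they are unused when
    planted is False.
    """
    supports = [sorted(rng.sample(range(n), k)) for _ in range(3)]
    T = [[[rng.gauss(0.0, 1.0) for _ in range(n)]
          for _ in range(n)] for _ in range(n)]
    if planted:
        for i in supports[0]:
            for j in supports[1]:
                for l in supports[2]:
                    T[i][j][l] += lam
    return T, supports

def detect(T, z=5.0):
    """Naive detection: flag a cluster if the total sum is large.

    Under pure noise the total sum has standard deviation n ** 1.5,
    so z is a threshold in units of that standard deviation.
    """
    n = len(T)
    total = sum(x for mat in T for row in mat for x in row)
    return total > z * n ** 1.5

def recover(T, k):
    """Naive recovery: keep the k indices with the largest marginal
    sums in each of the three modes."""
    n = len(T)
    sums = [[0.0] * n for _ in range(3)]
    for i in range(n):
        for j in range(n):
            for l in range(n):
                x = T[i][j][l]
                sums[0][i] += x
                sums[1][j] += x
                sums[2][l] += x
    return [sorted(sorted(range(n), key=lambda a: -s[a])[:k]) for s in sums]

rng = random.Random(0)
T, true_supports = planted_tensor(n=10, k=4, lam=10.0, rng=rng)
found = detect(T)                 # detection: does a cluster exist?
estimated = recover(T, k=4)       # recovery: where is the cluster?
```

Detection returns a single yes/no answer, while recovery returns one estimated index set per mode; the abstract's thresholds concern how weak the signal can be before tasks of this kind become statistically or computationally infeasible.
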
Subjects: Statistics Theory (math.ST); Computational Complexity (cs.CC); Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
Cite as: arXiv:2005.10743 [math.ST]
  (or arXiv:2005.10743v1 [math.ST] for this version)

Submission history

From: Anru R. Zhang
[v1] Thu, 21 May 2020 15:53:44 GMT (464kb,D)