Title: Reliable Clustering of Bernoulli Mixture Models

Abstract: A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent Bernoulli dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the information-theoretic PAC-learnability of BMMs when the number of clusters is unknown. In particular, we stipulate certain conditions on both the sample complexity and the dimension of the model that guarantee the Probably Approximately Correct (PAC)-clusterability of a given dataset. To the best of our knowledge, these findings are the first non-asymptotic (PAC) bounds on the sample complexity of learning BMMs.
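
To make the model definition in the abstract concrete, here is a minimal sketch of how data from a BMM could be generated: pick a cluster according to the mixing weights, then draw each binary coordinate independently with that cluster's success probability. The function name `sample_bmm` and the parameters `weights`, `probs`, `n`, and `seed` are illustrative assumptions, not notation from the paper, and the sketch does not implement the paper's clustering analysis.

```python
import random


def sample_bmm(weights, probs, n, seed=0):
    """Draw n binary vectors from a Bernoulli mixture (illustrative sketch).

    weights[k]  : mixing weight of cluster k (hypothetical parameter name)
    probs[k][j] : probability that coordinate j equals 1 in cluster k
    Returns (labels, data), where labels[i] is the latent cluster of data[i].
    """
    rng = random.Random(seed)
    labels, data = [], []
    for _ in range(n):
        # Step 1: choose the latent cluster according to the mixing weights.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Step 2: draw each coordinate independently as a Bernoulli variable.
        x = [1 if rng.random() < p else 0 for p in probs[k]]
        labels.append(k)
        data.append(x)
    return labels, data


# Two clusters over three binary dimensions.
labels, data = sample_bmm([0.5, 0.5], [[0.9, 0.1, 0.9], [0.1, 0.9, 0.1]], n=5)
```

Clustering in this setting means recovering `labels` from `data` alone, without being told the number of clusters in advance.
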
Comments: 24 pages
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Machine Learning (stat.ML)
Cite as: arXiv:1710.02101 [cs.LG]
  (or arXiv:1710.02101v2 [cs.LG] for this version)

Submission history

From: Amir Najafi
[v1] Thu, 5 Oct 2017 16:22:27 GMT (95kb)
[v2] Sun, 16 Dec 2018 19:35:33 GMT (147kb)
[v3] Sun, 16 Jun 2019 04:55:27 GMT (92kb)