Title: Convergence Rates of Latent Topic Models Under Relaxed Identifiability Conditions

Authors: Yining Wang
Abstract: In this paper we study the frequentist convergence rate of the Latent Dirichlet Allocation (Blei et al., 2003) topic model. We show that the maximum likelihood estimator converges to one of the finitely many equivalent parameters in Wasserstein distance at a rate of $n^{-1/4}$, without assuming separability or non-degeneracy of the underlying topics or the existence of more than three words per document, thus generalizing the previous works of Anandkumar et al. (2012, 2014) from an information-theoretic perspective. We also show that the $n^{-1/4}$ convergence rate is optimal in the worst case.
Comments: 26 pages, 1 table. Added significantly more exposition and a numerical procedure to check the order of degeneracy. Proofs slightly altered, with explicit constants given in various places
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Journal reference: Electronic Journal of Statistics, 13(1):37-66, 2019
Cite as: arXiv:1710.11070 [stat.ML]
  (or arXiv:1710.11070v2 [stat.ML] for this version)

Submission history

From: Yining Wang
[v1] Mon, 30 Oct 2017 17:05:28 GMT (26kb)
[v2] Sat, 17 Mar 2018 21:35:57 GMT (35kb)
