Title: Estimation and Concentration of Missing Mass of Functions of Discrete Probability Distributions

Abstract: Given a positive function $g$ from $[0,1]$ to the reals, the function's missing mass in a sequence of iid samples, defined as the sum of $g(pr(x))$ over the missing letters $x$, is introduced and studied. The missing mass of a function generalizes the classical missing mass, and has several interesting connections to other related estimation problems. Minimax estimation is studied for order-$\alpha$ missing mass ($g(p)=p^{\alpha}$) for both integer and non-integer values of $\alpha$. Exact minimax convergence rates are obtained for the integer case. Concentration is studied for a class of functions and specific results are derived for order-$\alpha$ missing mass and missing Shannon entropy ($g(p)=-p\log p$). Sub-Gaussian tail bounds with near-optimal worst-case variance factors are derived. Two new notions of concentration, named strongly sub-Gamma and filtered sub-Gaussian concentration, are introduced and shown to result in right tail bounds that are better than those obtained from sub-Gaussian concentration.
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT)
Cite as: arXiv:2110.01968 [math.ST]
  (or arXiv:2110.01968v1 [math.ST] for this version)
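
To illustrate the definition given in the abstract, the following is a minimal Python sketch that sums $g(p)$ over the letters that do not appear in a sample. It assumes the true distribution is known, which is not the case in the estimation problem the paper studies. The function name `missing_mass_of_function`, the toy distribution, and the sample are hypothetical, and the natural logarithm is an assumed choice of base for the entropy case.

```python
import math

def missing_mass_of_function(g, probs, sample):
    """Sum g(p_x) over the letters x that are absent from the sample.

    probs  : dict mapping each letter to its (positive) probability
    sample : iterable of observed letters
    """
    seen = set(sample)
    return sum(g(p) for x, p in probs.items() if x not in seen)

# Hypothetical toy distribution and sample, for illustration only.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
sample = ["a", "b", "a", "a"]  # letters "c" and "d" are missing

# Classical missing mass: g(p) = p
classical = missing_mass_of_function(lambda p: p, probs, sample)

# Order-alpha missing mass with alpha = 2: g(p) = p^2
order_2 = missing_mass_of_function(lambda p: p ** 2, probs, sample)

# Missing Shannon entropy: g(p) = -p log p (natural log assumed)
entropy = missing_mass_of_function(lambda p: -p * math.log(p), probs, sample)

print(classical, order_2, entropy)
```

With $g(p)=p$ the sketch reduces to the classical missing mass, that is, the total probability of the unseen letters.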

Submission history

From: Prafulla Chandra Mr
[v1] Tue, 5 Oct 2021 11:57:55 GMT (38kb)
