Statistics > Methodology
Title: PAC Mode Estimation using PPR Martingale Confidence Sequences
(Submitted on 10 Sep 2021 (v1), last revised 11 Apr 2022 (this version, v3))
Abstract: We consider the problem of correctly identifying the \textit{mode} of a discrete distribution $\mathcal{P}$ with sufficiently high probability by observing a sequence of i.i.d. samples drawn from $\mathcal{P}$. This problem reduces to the estimation of a single parameter when $\mathcal{P}$ has a support set of size $K = 2$. After noting that this special case is tackled very well by prior-posterior-ratio (PPR) martingale confidence sequences \citep{waudby-ramdas-ppr}, we propose a generalisation to mode estimation, in which $\mathcal{P}$ may take $K \geq 2$ values. To begin, we show that the "one-versus-one" principle to generalise from $K = 2$ to $K \geq 2$ classes is more efficient than the "one-versus-rest" alternative. We then prove that our resulting stopping rule, denoted PPR-1v1, is asymptotically optimal (as the mistake probability is taken to $0$). PPR-1v1 is parameter-free and computationally light, and incurs significantly fewer samples than competitors even in the non-asymptotic regime. We demonstrate its gains in two practical applications of sampling: election forecasting and verification of smart contracts in blockchains.
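The abstract names the PPR-1v1 stopping rule without spelling it out, so the following is only a minimal sketch of how a one-versus-one stopping rule built on PPR confidence sequences might look, not the authors' implementation. It assumes a uniform Beta(1,1) prior for each pairwise comparison, a known support of size $K \geq 2$, and a union bound that gives each of the $K - 1$ comparisons a mistake budget of $\delta/(K-1)$. None of these choices is stated in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def log_ppr_at_half(s, f):
    """Log prior-posterior ratio at theta = 1/2 for a Bernoulli parameter.

    Assumption: uniform Beta(1,1) prior, so the prior density is 1 and the
    ratio is 1 / posterior_density(1/2) = B(s+1, f+1) * 2**(s+f),
    where s and f are the counts of the two values being compared.
    """
    log_beta = math.lgamma(s + 1) + math.lgamma(f + 1) - math.lgamma(s + f + 2)
    return log_beta + (s + f) * math.log(2.0)


def ppr_1v1_should_stop(counts, support, delta):
    """Return (stop, leader) for the current counts.

    The leader is the empirical mode. Stopping requires that, for every
    other value, theta = 1/2 is excluded from the pairwise PPR confidence
    sequence at level delta / (K - 1) (assumed union bound).
    """
    k = len(support)
    if k < 2:
        raise ValueError("support must contain at least two values")
    threshold = math.log((k - 1) / delta)
    leader = max(support, key=lambda v: counts[v])
    stop = all(
        log_ppr_at_half(counts[leader], counts[v]) >= threshold
        for v in support
        if v != leader
    )
    return stop, leader


def estimate_mode(sample, support, delta, max_samples=1_000_000):
    """Draw samples one at a time until the stopping rule fires.

    `sample` is a zero-argument callable returning one draw from the
    distribution. Returns (mode_estimate, samples_used), or
    (None, max_samples) if the sampling budget runs out first.
    """
    counts = Counter()
    for t in range(1, max_samples + 1):
        counts[sample()] += 1
        stop, leader = ppr_1v1_should_stop(counts, support, delta)
        if stop:
            return leader, t
    return None, max_samples
```

For example, `estimate_mode(lambda: random.choices(["a", "b", "c"], weights=[0.5, 0.3, 0.2])[0], ["a", "b", "c"], delta=0.05)` (after `import random`) would return a mode estimate together with the number of samples consumed. Working in log space with `math.lgamma` keeps the ratio numerically stable as the counts grow.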
Submission history
From: Rohan Shah
[v1] Fri, 10 Sep 2021 18:11:38 GMT (1568kb,D)
[v2] Sun, 23 Jan 2022 18:43:11 GMT (3949kb,D)
[v3] Mon, 11 Apr 2022 10:35:43 GMT (4259kb,D)