Computer Science > Data Structures and Algorithms
Title: Separations for Estimating Large Frequency Moments on Data Streams
(Submitted on 8 May 2021 (v1), last revised 7 Jul 2022 (this version, v4))
Abstract: We study the classical problem of moment estimation of an underlying vector whose $n$ coordinates are implicitly defined through a series of updates in a data stream. We show that if the updates to the vector arrive in the random-order insertion-only model, then there exist space efficient algorithms with improved dependencies on the approximation parameter $\varepsilon$. In particular, for any real $p > 2$, we first obtain an algorithm for $F_p$ moment estimation using $\tilde{\mathcal{O}}\left(\frac{1}{\varepsilon^{4/p}}\cdot n^{1-2/p}\right)$ bits of memory. Our techniques also give algorithms for $F_p$ moment estimation with $p>2$ on arbitrary order insertion-only and turnstile streams, using $\tilde{\mathcal{O}}\left(\frac{1}{\varepsilon^{4/p}}\cdot n^{1-2/p}\right)$ bits of space and two passes, which is the first optimal multi-pass $F_p$ estimation algorithm up to $\log n$ factors. Finally, we give an improved lower bound of $\Omega\left(\frac{1}{\varepsilon^2}\cdot n^{1-2/p}\right)$ for one-pass insertion-only streams. Our results separate the complexity of this problem both between random and non-random orders, as well as one-pass and multi-pass streams.
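To make the quantities in the abstract concrete, the following minimal sketch computes $F_p$ exactly from a stream of updates and evaluates the leading terms of the two stated bounds. It assumes the standard definition $F_p = \sum_i |x_i|^p$ and uses linear space, so it is a reference baseline only, not the algorithm from the paper. The function names and the `(index, delta)` update format are hypothetical, and the bound helpers drop all constants and polylogarithmic factors.

```python
from collections import Counter


def exact_fp(stream, p):
    """Exact F_p = sum_i |x_i|^p of the vector defined by the stream.

    `stream` is an iterable of (index, delta) updates; this format is an
    assumption for illustration. Uses space linear in the number of
    distinct indices, unlike the sublinear algorithms in the abstract.
    """
    x = Counter()
    for i, delta in stream:
        x[i] += delta
    return sum(abs(v) ** p for v in x.values())


def upper_bound_term(n, eps, p):
    """Leading term eps^(-4/p) * n^(1 - 2/p) of the stated upper bounds."""
    return eps ** (-4.0 / p) * n ** (1.0 - 2.0 / p)


def lower_bound_term(n, eps, p):
    """Leading term eps^(-2) * n^(1 - 2/p) of the one-pass lower bound."""
    return eps ** (-2.0) * n ** (1.0 - 2.0 / p)


# Insertion-only stream defining x = (3, 1, 1).
stream = [(0, 1), (1, 1), (0, 1), (2, 1), (0, 1)]
print(exact_fp(stream, 3))  # 3^3 + 1^3 + 1^3 = 29
```

For $p > 2$ and $\varepsilon < 1$, the ratio of the two leading terms is $\varepsilon^{-(2 - 4/p)} > 1$, which is the gap in the dependence on $\varepsilon$ that the abstract describes as a separation.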
Submission history
From: Samson Zhou
[v1] Sat, 8 May 2021 20:39:30 GMT (30kb)
[v2] Fri, 4 Jun 2021 03:32:41 GMT (85kb)
[v3] Fri, 11 Jun 2021 20:17:30 GMT (30kb)
[v4] Thu, 7 Jul 2022 07:00:46 GMT (30kb)