Optimality of Frequency Moment Estimation
Abstract
Estimating the second frequency moment of a stream up to $(1\pm\varepsilon)$ multiplicative error requires at most $O(\log n / \varepsilon^2)$ bits of space, due to a seminal result of Alon, Matias, and Szegedy. It is also known that at least $\Omega(\log n + 1/\varepsilon^{2})$ space is needed. We prove an optimal lower bound of $\Omega\left(\log \left(n \varepsilon^2 \right) / \varepsilon^2\right)$ for all $\varepsilon = \Omega(1/\sqrt{n})$. Note that when $\varepsilon>n^{-1/2 + c}$, where $c>0$, our lower bound matches the classic upper bound of AMS. For smaller values of $\varepsilon$ we also introduce a revised algorithm that improves the classic AMS bound and matches our lower bound. Our lower bound holds also for the more general problem of $p$-th frequency moment estimation for the range of $p\in (1,2]$, giving a tight bound in the only remaining range to settle the optimal space complexity of estimating frequency moments.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2024
- DOI:
- arXiv:
- arXiv:2411.02148
- Bibcode:
- 2024arXiv241102148B
- Keywords:
-
- Computer Science - Data Structures and Algorithms;
- Computer Science - Computational Complexity;
- Computer Science - Information Theory