Sharp Bounds for Generalized Uniformity Testing
Abstract
We study the problem of generalized uniformity testing \cite{BC17} of a discrete probability distribution: Given samples from a probability distribution $p$ over an {\em unknown} discrete domain $\mathbf{\Omega}$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some {\em subset} of $\mathbf{\Omega}$ versus $\epsilon$-far, in total variation distance, from any such uniform distribution. We establish tight bounds on the sample complexity of generalized uniformity testing. In more detail, we present a computationally efficient tester whose sample complexity is optimal, up to constant factors, and a matching information-theoretic lower bound. Specifically, we show that the sample complexity of generalized uniformity testing is $\Theta\left(1/(\epsilon^{4/3}\|p\|_3) + 1/(\epsilon^{2} \|p\|_2) \right)$.
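As a rough illustration of how the stated bound scales, the sketch below evaluates the expression $1/(\epsilon^{4/3}\|p\|_3) + 1/(\epsilon^{2}\|p\|_2)$ for an explicitly given distribution. It is not the tester from the paper: the function names `lp_norm` and `sample_complexity_scale` are hypothetical, and the constant hidden by $\Theta(\cdot)$ is assumed to be 1 purely for illustration. For $p$ uniform on $n$ elements, $\|p\|_3 = n^{-2/3}$ and $\|p\|_2 = n^{-1/2}$, so the expression reduces to $n^{2/3}/\epsilon^{4/3} + n^{1/2}/\epsilon^{2}$, which the example checks numerically.

```python
import math


def lp_norm(p, k):
    """Return the l_k norm of a probability vector p (hypothetical helper)."""
    return sum(x ** k for x in p) ** (1.0 / k)


def sample_complexity_scale(p, eps):
    """Evaluate 1/(eps^(4/3) * ||p||_3) + 1/(eps^2 * ||p||_2).

    Assumption: the constant hidden by Theta(.) is taken to be 1, so the
    result indicates scaling only, not an actual sample count.
    """
    term_l3 = 1.0 / (eps ** (4.0 / 3.0) * lp_norm(p, 3))
    term_l2 = 1.0 / (eps ** 2 * lp_norm(p, 2))
    return term_l3 + term_l2


# Example: p uniform on n elements, where the norms have closed forms.
n, eps = 1000, 0.1
uniform = [1.0 / n] * n
scale = sample_complexity_scale(uniform, eps)
closed_form = n ** (2.0 / 3.0) / eps ** (4.0 / 3.0) + math.sqrt(n) / eps ** 2
print(scale, closed_form)
```

In practice the tester does not know $p$, so the norms cannot be computed directly as they are here; the sketch only shows how the two terms trade off as $\epsilon$ and the support size change.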
- Publication: arXiv e-prints
- Pub Date: September 2017
- DOI:
- arXiv: arXiv:1709.02087
- Bibcode: 2017arXiv170902087D
- Keywords: Computer Science - Data Structures and Algorithms; Computer Science - Information Theory; Computer Science - Machine Learning; Mathematics - Statistics Theory