Structure preservation via the Wasserstein distance
Abstract
We show that under minimal assumptions on a random vector $X\in\mathbb{R}^d$ and with high probability, given $m$ independent copies of $X$, the coordinate distribution of each vector $(\langle X_i,\theta \rangle)_{i=1}^m$ is dictated by the distribution of the true marginal $\langle X,\theta \rangle$. Specifically, we show that with high probability, \[\sup_{\theta \in S^{d-1}} \left( \frac{1}{m}\sum_{i=1}^m \left|\langle X_i,\theta \rangle^\sharp - \lambda^\theta_i \right|^2 \right)^{1/2} \leq c \left( \frac{d}{m} \right)^{1/4},\] where $\lambda^{\theta}_i = m\int_{(\frac{i-1}{m}, \frac{i}{m}]} F_{ \langle X,\theta \rangle }^{-1}(u)\,du$ and $a^\sharp$ denotes the monotone non-decreasing rearrangement of $a$. Moreover, this estimate is optimal. The proof follows from a sharp estimate on the worst Wasserstein distance between a marginal of $X$ and its empirical counterpart, $\frac{1}{m} \sum_{i=1}^m \delta_{\langle X_i, \theta \rangle}$.
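The quantity on the left-hand side can be explored numerically. The following is a minimal sketch, not the authors' method: it assumes $X$ is a standard Gaussian vector, so that every marginal $\langle X,\theta \rangle$ is $N(0,1)$ and $\lambda^\theta_i = m\left(\varphi(z_{i-1}) - \varphi(z_i)\right)$ with $z_i = \Phi^{-1}(i/m)$ and $\varphi$ the standard normal density. The supremum over $S^{d-1}$ is replaced by a maximum over finitely many random directions, which only gives a lower bound on it. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def gaussian_lambdas(m):
    """lambda_i = m * integral of Phi^{-1}(u) over ((i-1)/m, i/m] for N(0,1).

    Uses the closed form: the integral of Phi^{-1} over (a, b] equals
    phi(Phi^{-1}(a)) - phi(Phi^{-1}(b)), where phi is the normal density.
    """
    nd = NormalDist()
    # Density at the quantile grid; it vanishes at the endpoints u = 0 and u = 1.
    dens = [0.0] + [nd.pdf(nd.inv_cdf(i / m)) for i in range(1, m)] + [0.0]
    return [m * (dens[i - 1] - dens[i]) for i in range(1, m + 1)]


def random_direction(d, rng):
    """Uniform random point on the unit sphere S^{d-1} (normalized Gaussian)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def rearrangement_error(sample, theta, lambdas):
    """( (1/m) * sum_i |<X_i, theta>^sharp - lambda_i|^2 )^(1/2)."""
    # Sorting gives the monotone non-decreasing rearrangement of the projections.
    proj = sorted(sum(x * t for x, t in zip(row, theta)) for row in sample)
    m = len(proj)
    return math.sqrt(sum((p - l) ** 2 for p, l in zip(proj, lambdas)) / m)


def worst_error_over_directions(d, m, n_directions, seed=0):
    """Maximum of the error over random directions (a lower bound on the sup)."""
    rng = random.Random(seed)
    # Assumption: X is standard Gaussian in R^d; the paper allows far more general X.
    sample = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    lambdas = gaussian_lambdas(m)
    return max(
        rearrangement_error(sample, random_direction(d, rng), lambdas)
        for _ in range(n_directions)
    )
```

For instance, `worst_error_over_directions(d=5, m=2000, n_directions=200)` returns a single number that can be set beside $(d/m)^{1/4}$; since the constant $c$ is not specified in the abstract, only the scaling in $d/m$ can be compared, not the absolute value. The link to the Wasserstein distance is that each $\lambda^\theta_i$ is the average of $F_{\langle X,\theta \rangle}^{-1}$ over $(\frac{i-1}{m}, \frac{i}{m}]$, so by Jensen's inequality the computed error is at most the $W_2$ distance between the empirical and true marginals.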
- Publication: arXiv e-prints
- Pub Date: September 2022
- arXiv: arXiv:2209.07058
- Bibcode: 2022arXiv220907058B
- Keywords: Mathematics - Statistics Theory; Mathematics - Functional Analysis; Mathematics - Probability
- E-Print: Original paper [v1] was split into two papers. Here is the first part. Second part is now called "Optimal non-gaussian Dvoretzky-Milman embeddings"