A Direct $\tilde{O}(1/\epsilon)$ Iteration Parallel Algorithm for Optimal Transport
Abstract
Optimal transportation, or computing the Wasserstein or ``earth mover's'' distance between two distributions, is a fundamental primitive which arises in many learning and statistical settings. We give an algorithm which solves this problem to additive $\epsilon$ with $\tilde{O}(1/\epsilon)$ parallel depth, and $\tilde{O}\left(n^2/\epsilon\right)$ work. Barring a breakthrough on a long-standing algorithmic open problem, this is optimal for first-order methods. Blanchet et. al. '18, Quanrud '19 obtained similar runtimes through reductions to positive linear programming and matrix scaling. However, these reduction-based algorithms use complicated subroutines which may be deemed impractical due to requiring solvers for second-order iterations (matrix scaling) or non-parallelizability (positive LP). The fastest practical algorithms run in time $\tilde{O}(\min(n^2 / \epsilon^2, n^{2.5} / \epsilon))$ (Dvurechensky et. al. '18, Lin et. al. '19). We bridge this gap by providing a parallel, first-order, $\tilde{O}(1/\epsilon)$ iteration algorithm without worse dependence on dimension, and provide preliminary experimental evidence that our algorithm may enjoy improved practical performance. We obtain this runtime via a primal-dual extragradient method, motivated by recent theoretical improvements to maximum flow (Sherman '17).
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2019
- DOI:
- 10.48550/arXiv.1906.00618
- arXiv:
- arXiv:1906.00618
- Bibcode:
- 2019arXiv190600618J
- Keywords:
-
- Computer Science - Data Structures and Algorithms;
- Computer Science - Machine Learning;
- Mathematics - Optimization and Control;
- Statistics - Computation;
- Statistics - Machine Learning
- E-Print:
- 23 pages, 2 figures