A Direct $\tilde{O}(1/\epsilon)$ Iteration Parallel Algorithm for Optimal Transport

doi:10.48550/arXiv.1906.00618

Optimal transportation, or computing the Wasserstein or ``earth mover's'' distance between two distributions, is a fundamental primitive that arises in many learning and statistical settings. We give an algorithm which solves this problem to additive error $\epsilon$ with $\tilde{O}(1/\epsilon)$ parallel depth and $\tilde{O}\left(n^2/\epsilon\right)$ work. Barring a breakthrough on a long-standing algorithmic open problem, this is optimal for first-order methods. Blanchet et al. '18 and Quanrud '19 obtained similar runtimes through reductions to positive linear programming and matrix scaling. However, these reduction-based algorithms use complicated subroutines which may be deemed impractical due to requiring solvers for second-order iterations (matrix scaling) or non-parallelizability (positive LP). The fastest practical algorithms run in time $\tilde{O}(\min(n^2 / \epsilon^2, n^{2.5} / \epsilon))$ (Dvurechensky et al. '18, Lin et al. '19). We bridge this gap by providing a parallel, first-order, $\tilde{O}(1/\epsilon)$ iteration algorithm without worse dependence on dimension, and provide preliminary experimental evidence that our algorithm may enjoy improved practical performance. We obtain this runtime via a primal-dual extragradient method, motivated by recent theoretical improvements to maximum flow (Sherman '17).
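
For orientation, the problem the abstract refers to is usually written as the linear program below. This is the standard textbook formulation rather than a quotation from the paper: the symbols $C$, $r$, $c$, $X$ and the normalization by $\|C\|_{\max}$ are assumptions introduced here for illustration, and the paper's own definitions may differ.

```latex
% Standard discrete optimal transport LP (notation assumed, not taken from the abstract).
% C is an n-by-n nonnegative cost matrix; r and c are probability vectors of length n.
\[
\mathrm{OPT} \;=\; \min_{X \in \mathbb{R}_{\ge 0}^{n \times n}} \langle C, X \rangle
\quad \text{subject to} \quad X \mathbf{1} = r, \qquad X^{\top} \mathbf{1} = c .
\]
% One common reading of "additive \epsilon" accuracy: a feasible plan \hat{X} with
\[
\langle C, \hat{X} \rangle \;\le\; \mathrm{OPT} + \epsilon \, \|C\|_{\max} .
\]
```

Under this reading, the $\tilde{O}\left(n^2/\epsilon\right)$ work bound is nearly linear in the $n^2$ entries of $C$ for fixed $\epsilon$.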

Publication:

arXiv e-prints

Pub Date:

June 2019

DOI:

10.48550/arXiv.1906.00618

arXiv:

arXiv:1906.00618

Bibcode:

2019arXiv190600618J

Keywords:

Computer Science - Data Structures and Algorithms;
Computer Science - Machine Learning;
Mathematics - Optimization and Control;
Statistics - Computation;
Statistics - Machine Learning

E-Print:

23 pages, 2 figures
