An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums
Abstract
Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced stochastic algorithms when run on a single machine, and are therefore not efficient. Centralized algorithms are fast, but their scaling is limited by global aggregation steps that result in communication bottlenecks. In this work, we propose an efficient \textbf{A}ccelerated \textbf{D}ecentralized stochastic algorithm for \textbf{F}inite \textbf{S}ums named ADFS, which uses local stochastic proximal updates and randomized pairwise communications between nodes. On $n$ machines, ADFS learns from $nm$ samples in the same time it takes optimal algorithms to learn from $m$ samples on one machine. This scaling holds until a critical network size is reached, which depends on communication delays, on the number of samples $m$, and on the network topology. We provide a theoretical analysis based on a novel augmented graph approach combined with a precise evaluation of synchronization times and an extension of the accelerated proximal coordinate gradient algorithm to arbitrary sampling. We illustrate the improvement of ADFS over state-of-the-art decentralized approaches with experiments.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2019
- DOI:
- 10.48550/arXiv.1905.11394
- arXiv:
- arXiv:1905.11394
- Bibcode:
- 2019arXiv190511394H
- Keywords:
-
- Mathematics - Optimization and Control;
- Computer Science - Distributed;
- Parallel;
- and Cluster Computing
- E-Print:
- Code available in source files. arXiv admin note: substantial text overlap with arXiv:1901.09865