SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

doi:10.48550/arXiv.1910.06378

Federated Averaging (FedAvg) has emerged as the algorithm of choice for federated learning due to its simplicity and low communication cost. However, in spite of recent research efforts, its performance is not fully understood. We obtain tight convergence rates for FedAvg and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence. As a solution, we propose a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client-drift' in its local updates. We prove that SCAFFOLD requires significantly fewer communication rounds and is not affected by data heterogeneity or client sampling. Further, we show that (for quadratics) SCAFFOLD can take advantage of similarity in the clients' data, yielding even faster convergence. The latter is the first result to quantify the usefulness of local steps in distributed optimization.
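The drift correction described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' code: the one-dimensional quadratic client losses f_i(x) = 0.5 * a_i * (x - b_i)^2, the function names, the step sizes, and full client participation with exact (non-stochastic) gradients. The control-variate update follows the commonly described variant that re-estimates each client's control variate from its local progress.

```python
# Toy comparison of FedAvg and a SCAFFOLD-style update on heterogeneous
# 1-D quadratics. Hypothetical setup: client i holds
# f_i(x) = 0.5 * a_i * (x - b_i)**2, given as the tuple (a_i, b_i).

def grad(a, b, x):
    """Exact gradient of 0.5 * a * (x - b)**2."""
    return a * (x - b)

def optimum(clients):
    """Minimizer of the average loss: sum(a_i * b_i) / sum(a_i)."""
    return sum(a * b for a, b in clients) / sum(a for a, _ in clients)

def fedavg(clients, rounds=200, local_steps=10, lr=0.05):
    x = 0.0
    for _ in range(rounds):
        finals = []
        for a, b in clients:
            y = x
            for _ in range(local_steps):
                y -= lr * grad(a, b, y)      # plain local gradient step
            finals.append(y)
        x = sum(finals) / len(finals)        # server averages the local models
    return x

def scaffold(clients, rounds=200, local_steps=10, lr=0.05, global_lr=1.0):
    n = len(clients)
    x = 0.0                 # server model
    c = 0.0                 # server control variate
    c_i = [0.0] * n         # per-client control variates
    for _ in range(rounds):
        model_deltas, control_deltas = [], []
        for i, (a, b) in enumerate(clients):
            y = x
            for _ in range(local_steps):
                # Corrected step: remove the client's own drift estimate
                # and add the server's estimate of the average direction.
                y -= lr * (grad(a, b, y) - c_i[i] + c)
            # Re-estimate the client control variate from local progress.
            c_new = c_i[i] - c + (x - y) / (local_steps * lr)
            model_deltas.append(y - x)
            control_deltas.append(c_new - c_i[i])
            c_i[i] = c_new
        x += global_lr * sum(model_deltas) / n
        c += sum(control_deltas) / n
    return x

if __name__ == "__main__":
    clients = [(1.0, 0.0), (4.0, 1.0)]   # different curvature and minimizer
    print("optimum :", optimum(clients))
    print("fedavg  :", fedavg(clients))
    print("scaffold:", scaffold(clients))
```

In this toy setting the two clients differ in both curvature and minimizer, so plain local steps pull the average toward a point that is not the minimizer of the average loss, while the corrected local steps have that minimizer as their fixed point. The sketch says nothing about the stochastic or client-sampling regimes analyzed in the paper.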

Publication:

arXiv e-prints

Pub Date:

October 2019

DOI:

10.48550/arXiv.1910.06378

arXiv:

arXiv:1910.06378

Bibcode:

2019arXiv191006378P

Keywords:

Computer Science - Machine Learning;
Computer Science - Distributed, Parallel, and Cluster Computing;
Mathematics - Optimization and Control;
Statistics - Machine Learning;
68W40;
68W15;
90C25;
90C06;
G.1.6;
F.2.1;
E.4

E-Print:

v2 contains analysis of FedAvg, non-convex rates of SCAFFOLD, and experimental evaluation. v3 fixes typos, ICML version. v4 slightly improves rate of SCAFFOLD for general convex functions.
