A Communication Efficient Collaborative Learning Framework for Distributed Features
Abstract
We introduce a collaborative learning framework allowing multiple parties having different sets of attributes about the same user to jointly build models without exposing their raw data or model parameters. In particular, we propose a Federated Stochastic Block Coordinate Descent (FedBCD) algorithm, in which each party conducts multiple local updates before each communication to effectively reduce the number of communication rounds among parties, a principal bottleneck for collaborative learning problems. We analyze theoretically the impact of the number of local updates and show that when the batch size, sample size, and the local iterations are selected appropriately, within $T$ iterations, the algorithm performs $\mathcal{O}(\sqrt{T})$ communication rounds and achieves some $\mathcal{O}(1/\sqrt{T})$ accuracy (measured by the average of the gradient norm squared). The approach is supported by our empirical evaluations on a variety of tasks and datasets, demonstrating advantages over stochastic gradient descent (SGD) approaches.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2019
- DOI:
- 10.48550/arXiv.1912.11187
- arXiv:
- arXiv:1912.11187
- Bibcode:
- 2019arXiv191211187L
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Artificial Intelligence;
- Statistics - Machine Learning
- E-Print:
- This paper is published at the 2nd International Workshop on Federated Learning for Data Privacy and Confidentiality, in Conjunction with NeurIPS 2019 (FL-NeurIPS 19): https://nips.cc/Conferences/2019/ScheduleMultitrack?event=13202