Interactive Byzantine-Resilient Gradient Coding for General Data Assignments
Abstract
We tackle the problem of Byzantine errors in distributed gradient descent within the Byzantine-resilient gradient coding framework. Our proposed solution can recover the exact full gradient in the presence of $s$ malicious workers with a data replication factor of only $s+1$. It generalizes previous solutions to any data assignment scheme that has a regular replication over all data samples. The scheme detects malicious workers through additional interactive communication and a small number of local computations at the main node, leveraging group-wise comparisons between workers with a provably optimal grouping strategy. The scheme requires at most $s$ interactive rounds that incur a total communication cost logarithmic in the number of data samples.
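As a rough illustration of why a replication factor of $s+1$ is enough to detect misbehavior: with $s+1$ copies of every partial gradient and at most $s$ malicious workers, at least one copy is honest, so any disagreement among the copies exposes a fault. The toy sketch below is not the paper's scheme. It omits the grouping strategy and the logarithmic-cost interactive search, and the names `disputed_samples`, `identify_liars`, the cyclic assignment, and the gradient values are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

Gradient = Tuple[float, ...]

def disputed_samples(
    assignment: Dict[int, List[int]],
    reported: Dict[Tuple[int, int], Gradient],
) -> List[int]:
    """Return the samples whose replicated partial gradients disagree.

    assignment maps a sample index to the workers that hold it;
    reported maps (worker, sample) to the gradient that worker sent.
    Exact equality is used for simplicity; a real system would need
    exact arithmetic or a tolerance.
    """
    disputed = []
    for sample, workers in assignment.items():
        distinct = {reported[(w, sample)] for w in workers}
        if len(distinct) > 1:
            disputed.append(sample)
    return sorted(disputed)

def identify_liars(
    sample: int,
    workers: List[int],
    reported: Dict[Tuple[int, int], Gradient],
    local_gradient: Callable[[int], Gradient],
) -> Set[int]:
    """Settle a dispute with one local computation at the main node."""
    truth = local_gradient(sample)
    return {w for w in workers if reported[(w, sample)] != truth}

# Toy setup (assumed): s = 1, so each sample is replicated s + 1 = 2 times.
TRUE = {0: (1.0,), 1: (2.0,), 2: (3.0,)}
assignment = {0: [0, 1], 1: [1, 2], 2: [2, 0]}
reported = {(w, i): TRUE[i] for i, ws in assignment.items() for w in ws}
reported[(1, 1)] = (9.0,)  # worker 1 lies about sample 1

conflicts = disputed_samples(assignment, reported)
liars = identify_liars(conflicts[0], assignment[conflicts[0]], reported, TRUE.__getitem__)
```

Here the main node recomputes only the single disputed gradient instead of the full gradient, which is the general spirit of trading a few local computations for a low replication factor.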
- Publication: arXiv e-prints
- Pub Date: January 2024
- DOI: 10.48550/arXiv.2401.16915
- arXiv: arXiv:2401.16915
- Bibcode: 2024arXiv240116915J
- Keywords: Computer Science - Information Theory; Computer Science - Distributed, Parallel, and Cluster Computing