Coded Alternating Least Squares for Straggler Mitigation in Distributed Recommendations
Abstract
Matrix factorization is an important representation learning technique, used for example in recommender systems, in which a large matrix is factorized into the product of two low-dimensional matrices termed latent representations. This paper investigates matrix factorization in distributed computing systems with stragglers, i.e., compute nodes that are slow to return computation results. A computation procedure, called coded Alternating Least Squares (ALS), is proposed for mitigating the effect of stragglers in such systems. The coded ALS algorithm iteratively computes the two low-dimensional latent matrices by solving linear equations, with the Entangled Polynomial Code (EPC) as a building block. We theoretically characterize the maximum number of stragglers that the algorithm can tolerate (the recovery threshold) in relation to the redundancy of the coding (the code rate). In addition, we derive the computational complexity of the coded ALS algorithm and conduct numerical experiments to validate our design.
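To make the notions of recovery threshold and code rate concrete, the following is a minimal sketch of straggler-tolerant coded computation. It is not the paper's Entangled Polynomial Code or its coded ALS procedure: it assumes a simpler polynomial (Reed-Solomon-style) code applied to a single matrix-vector product, and all function names (`coded_matvec`, `encode`, `solve`) are hypothetical. The matrix is split into `k` row blocks and encoded into `n` coded blocks, so any `k` worker replies suffice and up to `n - k` stragglers can be ignored; the code rate in this toy setting is `k / n`.

```python
from fractions import Fraction
from itertools import combinations


def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def solve(V, B):
    """Solve V C = B exactly by Gauss-Jordan elimination (V square, invertible)."""
    n = len(V)
    aug = [list(V[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def encode(blocks, points):
    """Worker at point t stores sum_j blocks[j] * t**j (entrywise combination)."""
    k = len(blocks)
    n_rows, n_cols = len(blocks[0]), len(blocks[0][0])
    return [
        [[sum(blocks[j][r][c] * t**j for j in range(k)) for c in range(n_cols)]
         for r in range(n_rows)]
        for t in points
    ]


def coded_matvec(A, x, k, n, stragglers=()):
    """Compute A @ x from any k of n coded workers; `stragglers` never reply."""
    rows_per_block = len(A) // k  # assumption: k divides the number of rows
    blocks = [A[j * rows_per_block:(j + 1) * rows_per_block] for j in range(k)]
    points = [Fraction(t) for t in range(1, n + 1)]  # distinct evaluation points
    coded = encode(blocks, points)
    # Each non-straggling worker returns its coded block times x.
    received = [(t, matvec(Ct, x))
                for i, (t, Ct) in enumerate(zip(points, coded))
                if i not in stragglers]
    if len(received) < k:
        raise ValueError("fewer than k results: below the recovery threshold")
    used = received[:k]
    V = [[t**j for j in range(k)] for t, _ in used]  # Vandermonde matrix
    C = solve(V, [y for _, y in used])               # C[j] equals blocks[j] @ x
    return [v for block in C for v in block]


A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 2]
# With k = 2 and n = 4, every choice of n - k = 2 stragglers is tolerated.
for lost in combinations(range(4), 2):
    assert coded_matvec(A, x, k=2, n=4, stragglers=lost) == matvec(A, x)
```

Exact rational arithmetic (`fractions.Fraction`) is used here only so that decoding is exact in the toy example; a practical system would work in floating point or over a finite field, and the choice of evaluation points then affects numerical stability.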
- Publication: arXiv e-prints
- Pub Date: May 2021
- arXiv: arXiv:2105.03631
- Bibcode: 2021arXiv210503631W
- Keywords: Computer Science - Information Theory
- E-Print: 11 pages