Parallel Balanced Allocations: The Heavily Loaded Case
Abstract
We study parallel algorithms for the classical balls-into-bins problem, in which $m$ balls acting in parallel as separate agents are placed into $n$ bins. Algorithms operate in synchronous rounds, in each of which balls and bins exchange messages once. The goal is to minimize the maximal load over all bins using a small number of rounds and few messages. While the case of $m=n$ balls has been extensively studied, little is known about the heavily loaded case. In this work, we consider parallel algorithms for this somewhat neglected regime of $m\gg n$. The naive solution of allocating each ball to a bin chosen uniformly and independently at random results in maximal load $m/n+\Theta(\sqrt{m/n\cdot \log n})$ (for $m\geq n \log n$) w.h.p. In contrast, for the sequential setting, Berenbrink et al. (SIAM J. Comput. 2006) showed that letting each ball join the less loaded of two randomly selected bins reduces the maximal load to $m/n+O(\log\log m)$ w.h.p. To date, no parallel variant of such a result is known. We present a simple parallel threshold algorithm that obtains a maximal load of $m/n+O(1)$ w.h.p. within $O(\log\log (m/n)+\log^* n)$ rounds. The algorithm is symmetric (balls and bins all "look the same"), and balls send $O(1)$ messages in expectation per round. The additive term of $O(\log^* n)$ in the complexity is known to be tight for such algorithms (Lenzen and Wattenhofer, Distributed Computing 2016). We also prove that our analysis is tight, i.e., algorithms of the type we provide must run for $\Omega(\min\{\log\log (m/n),n\})$ rounds w.h.p. Finally, we give a simple asymmetric algorithm (i.e., balls are aware of a common labeling of the bins) that achieves a maximal load of $m/n + O(1)$ in a constant number of rounds w.h.p. Again, balls send only a single message per round, and bins receive $(1+o(1))m/n+O(\log n)$ messages w.h.p.
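To make the load bounds above more concrete, the following minimal Python sketch simulates the two baselines named in the abstract (uniform one-choice allocation and the sequential two-choice rule) alongside a generic round-based protocol with a fixed threshold. The function names, the `slack` parameter, and the fixed threshold of $\lceil m/n\rceil$ plus `slack` are illustrative assumptions. In particular, `fixed_threshold` is not the algorithm of the paper and is not claimed to meet its round bound; it only shows how a per-bin threshold caps the maximal load at $m/n+O(1)$.

```python
import random
from collections import Counter


def one_choice(m, n, rng):
    """Place each of m balls into a uniformly random bin; return bin loads."""
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1
    return loads


def two_choice(m, n, rng):
    """Sequential greedy: each ball samples two bins and joins the less loaded."""
    loads = [0] * n
    for _ in range(m):
        a, b = rng.randrange(n), rng.randrange(n)
        loads[a if loads[a] <= loads[b] else b] += 1
    return loads


def fixed_threshold(m, n, rng, slack=1):
    """Generic round-based threshold protocol (illustrative assumption only).

    In each round, every unplaced ball contacts one uniformly random bin.
    A bin accepts requests only while its load is below ceil(m/n) + slack
    and rejects the rest. Returns (loads, number_of_rounds).
    """
    threshold = -(-m // n) + slack  # ceil(m/n) + slack
    loads = [0] * n
    unplaced = m
    rounds = 0
    while unplaced > 0:
        rounds += 1
        # Count how many unplaced balls contacted each bin this round.
        requests = Counter(rng.randrange(n) for _ in range(unplaced))
        for b, k in requests.items():
            accepted = min(k, threshold - loads[b])
            loads[b] += accepted
            unplaced -= accepted
    return loads, rounds


if __name__ == "__main__":
    rng = random.Random(0)
    m, n = 100_000, 100
    print("average load:", m / n)
    print("one-choice max load:", max(one_choice(m, n, rng)))
    print("two-choice max load:", max(two_choice(m, n, rng)))
    loads, rounds = fixed_threshold(m, n, rng)
    print("fixed-threshold max load:", max(loads), "after", rounds, "rounds")
```

Running the script prints the maximal load of each strategy next to the average $m/n$. The fixed threshold guarantees a maximal load of at most $\lceil m/n\rceil$ plus `slack` by construction, but the last few balls may need many rounds to find a bin with free capacity, which is the kind of slowdown a more careful choice of thresholds has to avoid.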
 Publication:

arXiv e-prints
 Pub Date:
 April 2019
 arXiv:
 arXiv:1904.07532
 Bibcode:
 2019arXiv190407532L
 Keywords:

 Computer Science - Distributed, Parallel, and Cluster Computing
 E-Print:
 doi:10.1145/3323165.3323203