Parallel Query Processing with Heterogeneous Machines
Abstract
We study the problem of computing a full Conjunctive Query in parallel using $p$ heterogeneous machines. Our computational model is similar to the MPC model, but each machine has its own cost function mapping from the number of bits it receives to a cost. An optimal algorithm should minimize the maximum cost across all machines. We consider algorithms over a single communication round and give a lower bound and matching upper bound for databases where each relation has the same cardinality. We do this for both linear cost functions like in previous work, but also for more general cost functions. For databases with relations of different cardinalities, we also find a lower bound, and give matching upper bounds for specific queries like the cartesian product, the join, the star query, and the triangle query. Our approach is inspired by the HyperCube algorithm, but there are additional challenges involved when machines have heterogeneous cost functions.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2025
- arXiv:
- arXiv:2501.08896
- Bibcode:
- 2025arXiv250108896F
- Keywords:
-
- Computer Science - Databases
- E-Print:
- This is the full version of a paper accepted to ICDT 2025