PushdownDB: Accelerating a DBMS using S3 Computation
Abstract
This paper studies the effectiveness of pushing parts of DBMS analytics queries into the Simple Storage Service (S3) engine of Amazon Web Services (AWS), using a recently released capability called S3 Select. We show that some DBMS primitives (filter, projection, aggregation) can always be cost-effectively moved into S3. Other more complex operations (join, top-K, group-by) require reimplementation to take advantage of S3 Select and are often candidates for pushdown. We demonstrate these capabilities through experimentation using a new DBMS that we developed, PushdownDB. Experimentation with a collection of queries including TPC-H queries shows that PushdownDB is on average 30% cheaper and 6.7X faster than a baseline that does not use S3 Select.
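As a rough illustration of what pushing a filter and a projection into S3 can look like, the sketch below builds the parameters for an S3 Select request in Python. It is not taken from PushdownDB. The helper name `build_select_request`, the bucket and object names, and the choice of TPC-H `lineitem` columns are assumptions made for the example, and the object is assumed to be a CSV file with a header row.

```python
def build_select_request(bucket, key, columns, predicate=None):
    """Build keyword arguments for an S3 Select request (hypothetical helper).

    columns   -- column names to project; an empty list selects all columns
    predicate -- optional SQL boolean expression evaluated inside S3
    """
    # Projection pushdown: only the named columns leave S3.
    projection = ", ".join(f"s.{c}" for c in columns) if columns else "*"
    sql = f"SELECT {projection} FROM S3Object s"
    # Filter pushdown: rows failing the predicate are dropped inside S3.
    if predicate:
        sql += f" WHERE {predicate}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # Assumes a CSV object whose first row holds the column names.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


# Illustrative values only; the bucket and key are placeholders.
request = build_select_request(
    bucket="example-bucket",
    key="tpch/lineitem.csv",
    columns=["l_orderkey", "l_extendedprice"],
    predicate="CAST(s.l_quantity AS INT) < 24",
)
print(request["Expression"])
```

With AWS credentials configured, a dictionary like this could be passed to the boto3 S3 client as `select_object_content(**request)`. The sketch stops short of making the call so that it runs without network access.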
- Publication: arXiv e-prints
- Pub Date: February 2020
- DOI: 10.48550/arXiv.2002.05837
- arXiv: arXiv:2002.05837
- Bibcode: 2020arXiv200205837Y
- Keywords: Computer Science - Databases