Learned Indexes for a Google-scale Disk-based Database
Abstract
There is great excitement about learned index structures, but understandable skepticism about the practicality of a new method uprooting decades of research on B-Trees. In this paper, we work to remove some of that uncertainty by demonstrating how a learned index can be integrated into a distributed, disk-based database system: Google's Bigtable. We detail several design decisions we made to integrate learned indexes into Bigtable. Our results show that integrating a learned index significantly improves the end-to-end read latency and throughput of Bigtable.
- Publication: arXiv e-prints
- Pub Date: December 2020
- DOI: 10.48550/arXiv.2012.12501
- arXiv: arXiv:2012.12501
- Bibcode: 2020arXiv201212501A
- Keywords: Computer Science - Databases; Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Machine Learning
- E-Print: 4 pages, Presented at Workshop on ML for Systems at NeurIPS 2020