Accelerating XOR-based Erasure Coding using Program Optimization Techniques
Abstract
Erasure coding (EC) affords data redundancy for large-scale systems. XOR-based EC is an easy-to-implement method for optimizing EC. This paper addresses a significant performance gap between the state-of-the-art XOR-based EC approach (with 4.9 GB/s coding throughput) and Intel's high-performance EC library based on another approach (with 6.7 GB/s). We propose a novel approach based on our observation that XOR-based EC virtually generates programs in a domain-specific language for XORing byte arrays. We formalize such programs as straight-line programs (SLPs), a notion from compiler construction, and optimize them using various program optimization techniques. Our optimization flow is three-fold: 1) reducing operations using grammar compression algorithms; 2) reducing memory accesses using deforestation, a functional program optimization method; and 3) reducing cache misses using the (red-blue) pebble game from program analysis. We provide an experimental library, which outperforms Intel's library with 8.92 GB/s throughput.
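To make the first optimization step concrete, here is a minimal Python sketch of an SLP over XOR and of operation reduction by factoring out shared pairs. It is an illustration only, not the authors' implementation: the abstract does not say which grammar compression algorithm is used, so a RePair-style greedy pair replacement is assumed here, and all names (`xor_bytes`, `count_xors`, `compress_pairs`, `evaluate`, the variables `d0`..`d3`, `p0`..`p2`, and the temporaries `t0`, `t1`, ...) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte arrays element by element."""
    return bytes(x ^ y for x, y in zip(a, b))


def count_xors(slp):
    """Count XOR operations: a definition with k operands costs k - 1 XORs."""
    return sum(len(operands) - 1 for operands in slp.values())


def compress_pairs(slp):
    """Greedy RePair-style reduction (an assumed stand-in, not the paper's method).

    XOR is associative and commutative, so each definition is treated as a set
    of operands. The most frequent operand pair is repeatedly replaced by a
    fresh temporary until no pair occurs at least twice.
    """
    slp = {name: set(operands) for name, operands in slp.items()}
    fresh = 0
    while True:
        pairs = Counter()
        for operands in slp.values():
            pairs.update(combinations(sorted(operands), 2))
        if not pairs:
            break
        (a, b), freq = pairs.most_common(1)[0]
        if freq < 2:
            break
        temp = f"t{fresh}"  # hypothetical naming scheme for temporaries
        fresh += 1
        for operands in slp.values():
            if a in operands and b in operands:
                operands -= {a, b}
                operands.add(temp)
        slp[temp] = {a, b}
    return slp


def evaluate(slp, inputs):
    """Evaluate every definition of the SLP, memoizing intermediate values."""
    values = dict(inputs)

    def value_of(name):
        if name not in values:
            operands = [value_of(op) for op in sorted(slp[name])]
            acc = operands[0]
            for operand in operands[1:]:
                acc = xor_bytes(acc, operand)
            values[name] = acc
        return values[name]

    return {name: value_of(name) for name in slp}


# Toy parity definitions over four data blocks (made up for illustration).
original = {
    "p0": {"d0", "d1", "d2"},
    "p1": {"d1", "d2", "d3"},
    "p2": {"d0", "d1", "d2", "d3"},
}
compressed = compress_pairs(original)

inputs = {f"d{i}": bytes([i + 1, 2 * i + 7, 255 - i]) for i in range(4)}
before = evaluate(original, inputs)
after = evaluate(compressed, inputs)
print(count_xors(original), count_xors(compressed))
```

On this toy input the XOR count drops from 7 to 4 while the three parities stay bit-for-bit identical. The sketch covers only the operation-reduction step; deforestation and the pebble-game scheduling mentioned in the abstract are not modeled.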
- Publication: arXiv e-prints
- Pub Date: August 2021
- DOI: 10.48550/arXiv.2108.02692
- arXiv: arXiv:2108.02692
- Bibcode: 2021arXiv210802692U
- Keywords: Computer Science - Programming Languages; Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Performance
- E-Print: 18 pages. Author's version of a paper accepted at SC'21 https://sc21.supercomputing.org/