Adaptive Encoding Strategies for Erasing-Based Lossless Floating-Point Compression
Abstract
Lossless floating-point time series compression is crucial for a wide range of critical scenarios. However, compressing time series losslessly is challenging because of the complex underlying bit layouts of floating-point values. The state-of-the-art erasing-based compression algorithm Elf demonstrates impressive performance. We explore the encoding strategies of Elf in depth and find that there is still much room for improvement. In this paper, we propose Elf*, which employs a set of optimizations for leading zeros, center bits, and the sharing condition. Specifically, we develop a dynamic programming algorithm with a set of pruning strategies to compute the adaptive approximation rules efficiently. We theoretically prove that the adaptive approximation rules are globally optimal. We further extend Elf* to Streaming Elf*, i.e., SElf*, which achieves almost the same compression ratio as Elf* while enjoying even higher efficiency in streaming scenarios. We compare Elf* and SElf* with 8 competitors using 22 datasets. The results demonstrate that SElf* achieves a 9.2% relative compression ratio improvement over the best streaming competitor while maintaining similar efficiency, and that Elf* ranks among the most competitive batch compressors. All source code is publicly released.
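To make the terms "erasing", "leading zeros", and "center bits" concrete, the following is a minimal sketch of the general idea behind erasing-based XOR compression. It is not the Elf or Elf* encoding itself: the helper names (`erase`, `xor_layout`) are hypothetical, and recovering the original value by rounding to `alpha` decimal places is a simplifying assumption made for this illustration.

```python
import struct


def to_bits(x: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a double as an integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def from_bits(b: int) -> float:
    """Inverse of to_bits."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def erase(v: float, alpha: int) -> float:
    """Zero as many low mantissa bits as possible (hypothetical helper).

    Assumption: v has at most `alpha` decimal places, so it can be
    recovered from the erased value with round(erased, alpha).
    """
    bits = to_bits(v)
    for k in range(52, -1, -1):  # try the largest erasure first
        candidate = from_bits(bits & ~((1 << k) - 1))
        if round(candidate, alpha) == v:
            return candidate
    return v  # nothing could be erased safely


def xor_layout(prev: float, cur: float) -> tuple[int, int, int]:
    """Split prev XOR cur into (leading zeros, center bits, trailing zeros)."""
    x = to_bits(prev) ^ to_bits(cur)
    if x == 0:
        return (64, 0, 0)
    leading = 64 - x.bit_length()
    trailing = (x & -x).bit_length() - 1
    return (leading, 64 - leading - trailing, trailing)


values = [3.17, 3.18]
erased = [erase(v, 2) for v in values]
print(xor_layout(values[0], values[1]))  # layout without erasing
print(xor_layout(erased[0], erased[1]))  # layout after erasing
```

An XOR-based encoder only needs to store the center bits plus a description of where they sit, so erasing low mantissa bits before XORing tends to lengthen the run of trailing zeros and shrink the payload. How the leading-zero counts and center-bit counts are then encoded is what the abstract refers to as the encoding strategies.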
- Publication: arXiv e-prints
- Pub Date: August 2023
- DOI: 10.48550/arXiv.2308.11915
- arXiv: arXiv:2308.11915
- Bibcode: 2023arXiv230811915L
- Keywords: Computer Science - Data Structures and Algorithms