Accurate and Efficient Time Series Matching by Season- and Trend-aware Symbolic Approximation -- Extended Version Including Additional Evaluation and Proofs
Abstract
Processing and analyzing time series datasets has become a central issue in many domains, requiring data management systems to support time series as a native data type. A crucial prerequisite of these systems is time series matching, which is still a challenging problem. A time series is a high-dimensional data type: its representation is storage-consuming and its comparison is time-consuming. Among the representation techniques that tackle these challenges, the symbolic aggregate approximation (SAX) is the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a symbol from a small alphabet. However, SAX ignores the deterministic behavior of time series, such as cyclically repeating patterns or a trend component, which affects all segments and distorts the symbolic distribution. In this paper, we present a season-aware and a trend-aware symbolic approximation. We show that this improves the symbolic distribution and increases the representation accuracy without increasing the memory footprint. Most importantly, this enables more efficient time series matching, providing a match up to three orders of magnitude faster than SAX.
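The segment-and-discretize step of plain SAX mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `sax`, its parameters, the use of equiprobable breakpoints under a standard normal distribution, and the letter alphabet are assumptions based on the commonly described form of SAX, and the season- and trend-aware variants proposed in the paper are not shown.

```python
from bisect import bisect_left
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet_size):
    """Hypothetical helper: map a numeric series to a SAX-style word.

    Assumes 1 <= n_segments <= len(series) and 2 <= alphabet_size <= 26.
    """
    # Step 1: z-normalize so the values are roughly comparable to N(0, 1).
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        normalized = [0.0] * len(series)  # constant series: no variation
    else:
        normalized = [(x - mu) / sigma for x in series]

    # Step 2: segment the series and keep one mean per segment
    # (piecewise aggregate approximation).
    n = len(normalized)
    segment_means = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment_means.append(mean(normalized[start:end]))

    # Step 3: breakpoints that split N(0, 1) into equiprobable regions.
    standard_normal = NormalDist()
    breakpoints = [
        standard_normal.inv_cdf(k / alphabet_size)
        for k in range(1, alphabet_size)
    ]

    # Step 4: each segment mean becomes the letter of the region it falls in.
    return "".join(
        chr(ord("a") + bisect_left(breakpoints, value))
        for value in segment_means
    )


if __name__ == "__main__":
    # A purely increasing toy series, split into 4 segments with 4 symbols.
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4))
```

In this toy input the series is a pure upward trend, so the resulting word simply follows the trend from the lowest symbol to the highest. That is one way to picture how a deterministic component can dominate the symbols assigned to every segment.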
- Publication:
- arXiv e-prints
- Pub Date:
- May 2021
- DOI:
- 10.48550/arXiv.2105.14867
- arXiv:
- arXiv:2105.14867
- Bibcode:
- 2021arXiv210514867K
- Keywords:
- Computer Science - Databases