A log-linear time algorithm for constrained changepoint detection
Abstract
Changepoint detection is a central problem in time series and genomic data. For some applications, it is natural to impose constraints on the directions of changes. One example is ChIP-seq data, for which adding an up-down constraint improves peak detection accuracy, but makes the optimization problem more complicated. We show how a recently proposed functional pruning technique can be adapted to solve such constrained changepoint detection problems. This leads to a new algorithm which can solve problems with arbitrary affine constraints on adjacent segment means, and which has empirical time complexity that is log-linear in the amount of data. This algorithm achieves state-of-the-art accuracy in a benchmark of several genomic data sets, and is orders of magnitude faster than existing algorithms that have similar accuracy. Our implementation is available as the PeakSegPDPA function in the coseg R package, https://github.com/tdhock/coseg
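To make the up-down constraint concrete, the following is a minimal brute-force sketch, not the functional pruning algorithm the abstract describes. It assumes a Poisson loss on non-negative count data, a fixed model of three segments (background, peak, background), and strict inequalities between adjacent segment means. The function names `poisson_loss` and `best_up_down_3seg` are hypothetical and do not come from the coseg package.

```python
import math
from itertools import combinations

def poisson_loss(counts, mean):
    """Poisson negative log-likelihood up to a constant: sum(mean - z*log(mean))."""
    if mean == 0:
        # With non-negative counts, a zero mean implies every count is zero.
        return 0.0
    return sum(mean - z * math.log(mean) for z in counts)

def best_up_down_3seg(counts):
    """Exhaustively search 3-segment models whose means go up, then down.

    Returns (loss, (t1, t2), means), or None if no feasible model exists.
    Segments are counts[:t1], counts[t1:t2], counts[t2:].
    """
    n = len(counts)
    best = None
    for t1, t2 in combinations(range(1, n), 2):
        segs = [counts[:t1], counts[t1:t2], counts[t2:]]
        means = [sum(s) / len(s) for s in segs]
        # Up-down constraint: the middle (peak) mean must exceed both neighbors.
        if not (means[0] < means[1] > means[2]):
            continue
        loss = sum(poisson_loss(s, m) for s, m in zip(segs, means))
        if best is None or loss < best[0]:
            best = (loss, (t1, t2), means)
    return best

result = best_up_down_3seg([1, 1, 1, 10, 10, 10, 1, 1, 1])
print(result)
```

This search examines every pair of changepoints, so its cost grows quadratically with the number of data points even for a single peak. The abstract's point is that functional pruning avoids this kind of enumeration and reaches empirically log-linear time.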
- Publication: arXiv e-prints
- Pub Date: March 2017
- DOI: 10.48550/arXiv.1703.03352
- arXiv: arXiv:1703.03352
- Bibcode: 2017arXiv170303352D
- Keywords: Statistics - Computation; Quantitative Biology - Genomics; Statistics - Machine Learning