Optimal Change-Point Detection and Localization
Abstract
Given a time series ${\bf Y}$ in $\mathbb{R}^n$ with a piecewise constant mean and independent components, the twin problems of change-point detection and change-point localization respectively amount to detecting the existence of times where the mean varies and estimating the positions of those change-points. In this work, we tightly characterize optimal rates for both problems and uncover the phase transition phenomenon from a global testing problem to a local estimation problem. Introducing a suitable definition of the energy of a change-point, we first establish in the single change-point setting that the optimal detection threshold is $\sqrt{2\log\log(n)}$. When the energy is just above the detection threshold, the problem of localizing the change-point becomes purely parametric: it depends only on the difference in means and no longer on the position of the change-point. Interestingly, for most change-point positions, it is possible to detect and localize them at a much smaller energy level. In the multiple change-point setting, we establish the energy detection threshold and show similarly that the optimal localization error of a specific change-point becomes purely parametric. Along the way, tight optimal rates for Hausdorff and $l_1$ estimation losses of the vector of all change-point positions are also established. Two procedures achieving these optimal rates are introduced. The first one is a least-squares estimator with a new multiscale penalty that favours well-spread change-points. The second one is a two-step multiscale post-processing procedure whose computational complexity can be as low as $O(n\log(n))$. Notably, these two procedures accommodate the presence of possibly many low-energy, and therefore undetectable, change-points and are still able to detect and localize high-energy change-points even in the presence of those nuisance parameters.
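To make the single change-point setting concrete, the following is a minimal sketch of a classical CUSUM scan. It is not either of the two procedures introduced in the paper. It assumes unit-variance noise and takes the signal strength of a jump $\Delta$ at position $\tau$ to scale as $|\Delta|\sqrt{\tau(n-\tau)/n}$, which is the usual CUSUM normalization and is an assumption here, since the abstract does not state the paper's definition of energy. The function names and the `slack` factor are hypothetical, and the threshold is only an illustration of the $\sqrt{2\log\log(n)}$ order, not a calibrated test.

```python
import math
import random


def cusum_scan(y):
    """Return (best_t, best_stat): the split t in 1..n-1 that maximises
    the standardised CUSUM statistic, assuming unit-variance noise."""
    n = len(y)
    total = sum(y)
    best_t, best_stat = 0, 0.0
    left = 0.0
    for t in range(1, n):
        left += y[t - 1]                      # running sum of y[0:t]
        mean_left = left / t
        mean_right = (total - left) / (n - t)
        # Scaling by sqrt(t(n-t)/n) gives the mean difference unit variance.
        stat = math.sqrt(t * (n - t) / n) * abs(mean_right - mean_left)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat


def detect_single_change(y, slack=2.0):
    """Compare the maximal CUSUM statistic with slack * sqrt(2 log log n).

    `slack` is a hypothetical safety factor, not a constant from the paper.
    Requires n >= 3 so that log(log(n)) is positive.
    """
    n = len(y)
    threshold = slack * math.sqrt(2.0 * math.log(math.log(n)))
    t_hat, stat = cusum_scan(y)
    return stat > threshold, t_hat, stat, threshold


if __name__ == "__main__":
    rng = random.Random(0)
    n, tau, delta = 2000, 700, 0.5
    # Piecewise constant mean (0 before tau, delta after) plus N(0, 1) noise.
    y = [(delta if i >= tau else 0.0) + rng.gauss(0.0, 1.0) for i in range(n)]
    print(detect_single_change(y))
```

The scan uses a running sum, so it costs $O(n)$ time for a single change-point. The maximising split doubles as a location estimate, which mirrors the detection-then-localization viewpoint of the abstract in the simplest possible form.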
 Publication:

arXiv e-prints
 Pub Date:
 October 2020
 DOI:
 10.48550/arXiv.2010.11470
 arXiv:
 arXiv:2010.11470
 Bibcode:
 2020arXiv201011470V
 Keywords:

 Mathematics - Statistics Theory;
 Statistics - Methodology
 E-Print:
 73 pages