Generalised learning of time-series: Ornstein-Uhlenbeck processes
Abstract
In machine learning, statistics, econometrics and statistical physics, cross-validation (CV) is used as a standard approach in quantifying the generalisation performance of a statistical model. A direct application of CV to time-series leads to the loss of serial correlations, conflicts with the requirement of preserving any non-stationarity, and results in the prediction of past data using future data. In this work, we propose a meta-algorithm called reconstructive cross-validation (rCV) that avoids all these issues. At first, k folds are formed with non-overlapping randomly selected subsets of the original time-series. Then, we generate k new partial time-series by removing data points from a given fold: every new partial time-series has points missing at random from a different entire fold. A suitable imputation or smoothing technique is used to reconstruct k time-series. We call these reconstructions secondary models. Thereafter, we build the primary k time-series models using the new time-series coming from the secondary models. The performance of the primary models is evaluated simultaneously by computing the deviations from the originally removed data points and from out-of-sample (OSS) data. Full cross-validation of time-series models can be practiced with rCV, along with the generation of learning curves.
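The steps described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name a specific imputation technique or primary model, so linear interpolation (as the secondary model) and a least-squares AR(1) fit (as the primary model) are stand-ins chosen here, and all function names and parameter values (`simulate_ou`, `make_folds`, `reconstruct`, `fit_ar1`, `rcv`, `theta`, `mu`, `sigma`, `dt`) are hypothetical.

```python
import numpy as np

def simulate_ou(n, theta, mu, sigma, dt, rng):
    """Euler-Maruyama discretisation of an Ornstein-Uhlenbeck process."""
    x = np.empty(n)
    x[0] = mu
    for t in range(n - 1):
        noise = sigma * np.sqrt(dt) * rng.standard_normal()
        x[t + 1] = x[t] + theta * (mu - x[t]) * dt + noise
    return x

def make_folds(n, k, rng):
    """k non-overlapping, randomly selected index subsets of 0..n-1."""
    return np.array_split(rng.permutation(n), k)

def reconstruct(series, missing_idx):
    """Secondary model (assumed): linear interpolation over removed points."""
    keep = np.ones(len(series), dtype=bool)
    keep[missing_idx] = False
    t = np.arange(len(series))
    filled = series.copy()
    filled[~keep] = np.interp(t[~keep], t[keep], series[keep])
    return filled

def fit_ar1(series):
    """Primary model (assumed): least-squares fit of x[t+1] = c + phi * x[t]."""
    phi, c = np.polyfit(series[:-1], series[1:], 1)
    return c, phi

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def rcv(in_sample, out_of_sample, k, rng):
    """Return one (removed-point RMSE, out-of-sample RMSE) row per fold."""
    scores = []
    for fold in make_folds(len(in_sample), k, rng):
        filled = reconstruct(in_sample, fold)   # secondary model
        c, phi = fit_ar1(filled)                # primary model
        # One-step-ahead predictions at the originally removed points.
        idx = fold[fold > 0]
        removed_err = rmse(in_sample[idx], c + phi * filled[idx - 1])
        # One-step-ahead predictions on the held-out tail of the series.
        prev = np.concatenate(([in_sample[-1]], out_of_sample[:-1]))
        oos_err = rmse(out_of_sample, c + phi * prev)
        scores.append((removed_err, oos_err))
    return np.array(scores)

rng = np.random.default_rng(0)
series = simulate_ou(500, theta=1.0, mu=0.0, sigma=0.5, dt=0.1, rng=rng)
scores = rcv(series[:400], series[400:], k=5, rng=rng)
print(scores)
```

Each row of `scores` holds the two errors for one fold. Repeating the call for increasing in-sample lengths would give the points of a learning curve, in the spirit of the last sentence of the abstract.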
- Publication:
- arXiv e-prints
- Pub Date:
- October 2019
- DOI:
- 10.48550/arXiv.1910.09394
- arXiv:
- arXiv:1910.09394
- Bibcode:
- 2019arXiv191009394S
- Keywords:
- Statistics - Machine Learning;
- Condensed Matter - Statistical Mechanics;
- Computer Science - Machine Learning;
- Statistics - Methodology;
- 37M10;
- 62M10;
- 62P35;
- 68T05;
- 68Q32;
- G.3;
- J.2
- E-Print:
- 7 pages, 4 figures