Quantile-based fuzzy C-means clustering of multivariate time series: Robust techniques
Abstract
Three robust methods for clustering multivariate time series from the point of view of generating processes are proposed. The procedures are robust versions of a fuzzy C-means model based on: (i) estimates of the quantile cross-spectral density and (ii) the classical principal component analysis. Robustness to the presence of outliers is achieved by using the so-called metric, noise and trimmed approaches. The metric approach incorporates in the objective function a distance measure aimed at neutralizing the effect of the outliers, the noise approach builds an artificial cluster expected to contain the outlying series, and the trimmed approach eliminates the most atypical series in the dataset. All the proposed techniques inherit the desirable properties of the quantile cross-spectral density, such as the ability to uncover general types of dependence. Results from a broad simulation study including multivariate linear, nonlinear and GARCH processes indicate that the algorithms are effective in coping with the presence of outlying series (i.e., series exhibiting a dependence structure different from that of the majority), clearly outperforming alternative procedures. The usefulness of the suggested methods is highlighted by means of two specific applications regarding financial and environmental series.
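To make the noise approach concrete, the following is a minimal sketch of fuzzy C-means with an additional noise cluster, in the style of the classical noise-clustering formulation. It is not the authors' implementation: the function names `noise_fcm` and `_memberships`, the fixed noise distance `delta`, the user-supplied initial centroids `V0`, and the use of generic feature vectors (standing in for the features derived from the quantile cross-spectral density and principal component analysis) are all assumptions made for illustration.

```python
import numpy as np


def _memberships(X, V, m, delta):
    """Memberships in the regular clusters, given centroids V (hypothetical helper)."""
    # Squared Euclidean distances between every series and every centroid.
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # avoid division by zero
    expo = 1.0 / (m - 1.0)
    w = (1.0 / d2) ** expo
    # The noise cluster sits at the same assumed distance delta from every series.
    noise_w = (1.0 / delta ** 2) ** expo
    return w / (w.sum(axis=1, keepdims=True) + noise_w)


def noise_fcm(X, V0, m=2.0, delta=1.0, n_iter=100, tol=1e-8):
    """Fuzzy C-means with a noise cluster (illustrative sketch).

    X     : (n, p) array, one feature vector per series.
    V0    : (C, p) array of initial centroids.
    m     : fuzziness parameter, m > 1.
    delta : assumed distance from every series to the noise cluster.
    Returns (U, V, noise): memberships (n, C), centroids (C, p),
    and noise-cluster memberships (n,).
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V0, dtype=float)
    for _ in range(n_iter):
        Um = _memberships(X, V, m, delta) ** m
        # Centroids are membership-weighted means of the feature vectors.
        V_new = (Um.T @ X) / Um.sum(axis=0)[:, None]
        converged = np.allclose(V_new, V, atol=tol)
        V = V_new
        if converged:
            break
    U = _memberships(X, V, m, delta)
    # Whatever membership is not given to regular clusters goes to the noise cluster.
    noise = 1.0 - U.sum(axis=1)
    return U, V, noise


# Toy data: two tight groups plus one distant outlier.
X = np.array([
    [0.0, 0.0], [0.2, 0.1], [-0.1, 0.2], [0.1, -0.2],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.1], [5.1, 5.2],
    [50.0, -50.0],
])
U, V, noise = noise_fcm(X, V0=[[1.0, 1.0], [4.0, 4.0]], delta=3.0)
```

In this toy run the distant point receives almost all of its membership in the noise cluster, so it barely influences the two centroids. The result depends on `delta` and on the initial centroids, both of which are chosen by hand here.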
- Publication: arXiv e-prints
- Pub Date: September 2021
- arXiv: arXiv:2109.11027
- Bibcode: 2021arXiv210911027L
- Keywords: Statistics - Methodology; Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: arXiv admin note: text overlap with arXiv:2109.03728