Canonical correlation analysis of high-dimensional data with very small sample support
Abstract
This paper is concerned with the analysis of correlation between two high-dimensional data sets when there are only a few correlated signal components and the number of samples is very small, possibly much smaller than the dimensions of the data. In such a scenario, a principal component analysis (PCA) rank-reduction preprocessing step is commonly performed before applying canonical correlation analysis (CCA). We present simple yet very effective approaches to jointly selecting the number of dimensions that should be retained in the PCA step and the number of correlated signals. These approaches are based on reduced-rank versions of the Bartlett–Lawley hypothesis test and the minimum description length information-theoretic criterion. Simulation results show that the techniques perform well for very small sample sizes, even in colored noise.
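The PCA-then-CCA pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It covers only the generic rank-reduction and canonical-correlation step, not the paper's reduced-rank Bartlett–Lawley or MDL order-selection rules, which are not reproduced here. The function name `pca_cca`, the argument `r` (number of PCA components kept per data set), and the convention that columns are samples are assumptions made for illustration.

```python
import numpy as np

def pca_cca(X, Y, r):
    """Sample canonical correlations of X (n x M) and Y (m x M)
    after keeping r principal components of each data set.
    Columns are samples; r is a hypothetical user-chosen PCA rank."""
    # Remove the sample mean of every variable (row).
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    # Thin SVDs; the rows of Vh are right singular vectors (length M).
    _, _, Vxh = np.linalg.svd(Xc, full_matrices=False)
    _, _, Vyh = np.linalg.svd(Yc, full_matrices=False)
    # Keep the r dominant right singular vectors: these are the
    # whitened, rank-r PCA representations of each data set.
    Vx = Vxh[:r].T  # M x r
    Vy = Vyh[:r].T  # M x r
    # The canonical correlations are the singular values of Vx^T Vy.
    k = np.linalg.svd(Vx.T @ Vy, compute_uv=False)
    return np.clip(k, 0.0, 1.0)  # guard against round-off above 1
```

With `M` samples and mean removal, choosing `r = M - 1` makes both retained subspaces coincide, so every sample canonical correlation equals one regardless of the data. This is why the PCA rank has to be chosen carefully when sample support is small.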
- Publication: Signal Processing
- Pub Date: November 2016
- DOI: 10.1016/j.sigpro.2016.05.020
- arXiv: arXiv:1604.02047
- Bibcode: 2016SigPr.128..449S
- Keywords: Bartlett-Lawley statistic; Canonical correlation analysis; Model-order selection; Principal component analysis; Small sample support; Computer Science - Information Theory