Determining Principal Component Cardinality through the Principle of Minimum Description Length
Abstract
PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition, NML (Normalized Maximum Likelihood), to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA by terms of the NML of linear regression, which are known.
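For readers unfamiliar with NML, the standard textbook definition can be sketched as follows. The notation here is an assumption made for illustration and is not taken from the paper: $x$ is the observed data, $\mathcal{M}_k$ is the model class with $k$ principal components, and $\hat{\theta}_k(\cdot)$ is the maximum-likelihood estimator within that class.

```latex
% Standard NML distribution for a model class M_k (notation assumed, not the paper's)
\[
  p_{\mathrm{NML}}(x \mid \mathcal{M}_k)
    = \frac{p\bigl(x;\, \hat{\theta}_k(x)\bigr)}
           {\int p\bigl(y;\, \hat{\theta}_k(y)\bigr)\, dy}
\]
% Corresponding code length (stochastic complexity): fit term plus complexity term
\[
  L_{\mathrm{NML}}(x \mid \mathcal{M}_k)
    = -\log p\bigl(x;\, \hat{\theta}_k(x)\bigr)
      + \log \int p\bigl(y;\, \hat{\theta}_k(y)\bigr)\, dy
\]
% MDL model selection: choose the number of components with the shortest code length
\[
  \hat{k} = \arg\min_{k} \; L_{\mathrm{NML}}(x \mid \mathcal{M}_k)
\]
```

The normalizing integral in the second term is the quantity that is generally hard to evaluate in closed form, which is why a bound in terms of a better-understood model class, such as linear regression, is useful.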
- Publication: arXiv e-prints
- Pub Date: December 2018
- DOI: 10.48550/arXiv.1901.00059
- arXiv: arXiv:1901.00059
- Bibcode: 2019arXiv190100059T
- Keywords: Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: LOD 2019