Uncertainty-Aware Principal Component Analysis

doi:10.48550/arXiv.1905.01127

We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the PCA sample covariance matrix that respects potential uncertainty in each of the inputs, building the mathematical foundation of our new method: uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables a better understanding of the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets. As a special case, we show how to propagate multivariate normal distributions through PCA in closed form. Furthermore, we discuss extensions and limitations of our approach.
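The idea of a covariance matrix that accounts for per-input uncertainty, and of pushing a normal distribution through the projection in closed form, can be illustrated with a minimal NumPy sketch. It is not taken from the paper: it assumes each input is given by a mean vector and a covariance matrix, treats the data set as an equal-weight mixture of those distributions, and applies the law of total covariance. The function names (`uncertain_covariance`, `ua_pca`, `project_normal`) are hypothetical.

```python
import numpy as np

def uncertain_covariance(means, covs):
    """Covariance of an equal-weight mixture of N distributions.

    Assumption for this sketch: law of total covariance, i.e. the
    covariance of the means plus the average per-input covariance.
    """
    means = np.asarray(means, dtype=float)   # shape (N, d)
    covs = np.asarray(covs, dtype=float)     # shape (N, d, d)
    center = means.mean(axis=0)
    centered = means - center
    between = centered.T @ centered / len(means)  # spread of the means
    within = covs.mean(axis=0)                    # average input uncertainty
    return center, between + within

def ua_pca(means, covs, k=2):
    """Top-k eigenvectors and eigenvalues of the uncertainty-aware covariance."""
    center, total = uncertain_covariance(means, covs)
    eigvals, eigvecs = np.linalg.eigh(total)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]         # indices of the k largest
    return center, eigvecs[:, order], eigvals[order]

def project_normal(mean, cov, center, components):
    """Image of N(mean, cov) under the linear map x -> components.T @ (x - center).

    A linear map of a normal distribution is again normal, so the
    projected mean and covariance describe it completely.
    """
    proj_mean = components.T @ (np.asarray(mean, dtype=float) - center)
    proj_cov = components.T @ np.asarray(cov, dtype=float) @ components
    return proj_mean, proj_cov

# Two inputs whose means differ only along x, but which are far more
# uncertain along y than along x.
means = [[0.0, 0.0], [2.0, 0.0]]
covs = [np.diag([0.5, 3.0]), np.diag([0.5, 3.0])]

center, components, variances = ua_pca(means, covs, k=1)
proj_mean, proj_cov = project_normal(means[1], covs[1], center, components)
```

In this toy example, ordinary PCA on the two means alone would choose the x-axis as the first component. Once the per-input covariances are included, the y-axis carries more total variance and becomes the first component, which is the kind of effect a sensitivity analysis with regard to uncertainty is meant to expose.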

Publication:

arXiv e-prints

Pub Date:

May 2019

DOI:

10.48550/arXiv.1905.01127

arXiv:

arXiv:1905.01127

Bibcode:

2019arXiv190501127G

Keywords:

Computer Science - Machine Learning;
Computer Science - Human-Computer Interaction;
Statistics - Machine Learning

E-Print:

IEEE Transactions on Visualization and Computer Graphics, 2020
