The extension of Pearson correlation coefficient, measuring noise, and selecting features
Abstract
It is hardly contested that Pearson's correlation coefficient is still the most important statistical measure of association. Because it is restricted to two variables, however, it sometimes falls short of users' needs and expectations. In particular, a multivariable version of the correlation coefficient could greatly improve the assessment of risk in a multi-asset investment portfolio. The correlation coefficient is derived from another concept: covariance. Although covariance can be extended naturally through its mathematical formula, such an extension is of no use. Worse, the correlation coefficient cannot be extended on the basis of its mathematical definition. In this article, we briefly explore random matrix theory to extend the notion of Pearson's correlation coefficient to an arbitrary number of variables. We then show how useful this measure is for gauging noise and, in turn, for selecting features, particularly in classification.
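For orientation, the two-variable starting point mentioned in the abstract, a correlation coefficient derived from covariance, can be sketched as follows. This is a minimal illustration of the standard Pearson formula only, not the multivariable extension the paper proposes; the function names and the sample return series are hypothetical and chosen for illustration.

```python
import math


def covariance(x, y):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    std_x = math.sqrt(covariance(x, x))  # variance is the covariance of x with itself
    std_y = math.sqrt(covariance(y, y))
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance(x, y) / (std_x * std_y)


# Hypothetical return series for two assets; the second is twice the first,
# so the two are perfectly linearly related.
returns_a = [0.01, -0.02, 0.015, 0.03, -0.01]
returns_b = [2 * r for r in returns_a]

r_ab = pearson(returns_a, returns_b)
```

The function takes exactly two sequences, which reflects the restriction the abstract describes: the formula is defined pairwise, so it offers no direct way to summarize the joint association of three or more variables in a single number.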
 Publication:
 arXiv e-prints
 Pub Date:
 February 2024
 DOI:
 10.48550/arXiv.2402.00543
 arXiv:
 arXiv:2402.00543
 Bibcode:
 2024arXiv240200543S
 Keywords:
 Quantitative Finance - Mathematical Finance
 EPrint:
 24 pages, 40 figures