Detection of Correlated Random Vectors
Abstract
In this paper, we investigate the problem of deciding whether two standard normal random vectors $\mathsf{X}\in\mathbb{R}^{n}$ and $\mathsf{Y}\in\mathbb{R}^{n}$ are correlated or not. This is formulated as a hypothesis testing problem, where under the null hypothesis, these vectors are statistically independent, while under the alternative, $\mathsf{X}$ and a randomly and uniformly permuted version of $\mathsf{Y}$, are correlated with correlation $\rho$. We analyze the thresholds at which optimal testing is information-theoretically impossible and possible, as a function of $n$ and $\rho$. To derive our information-theoretic lower bounds, we develop a novel technique for evaluating the second moment of the likelihood ratio using an orthogonal polynomials expansion, which among other things, reveals a surprising connection to integer partition functions. We also study a multi-dimensional generalization of the above setting, where rather than two vectors we observe two databases/matrices, and furthermore allow for partial correlations between these two.
- Publication:
-
arXiv e-prints
- Pub Date:
- January 2024
- DOI:
- arXiv:
- arXiv:2401.13429
- Bibcode:
- 2024arXiv240113429E
- Keywords:
-
- Computer Science - Information Theory;
- Computer Science - Machine Learning;
- Mathematics - Statistics Theory
- E-Print:
- 42 pages