Power analysis of knockoff filters for correlated designs
Abstract
The knockoff filter introduced by Barber and Candès (2016) is an elegant framework for controlling the false discovery rate (FDR) in variable selection. While empirical results indicate that this methodology is not too conservative, there is no conclusive theoretical result on its power. When the predictors are i.i.d. Gaussian, it is known that as the signal-to-noise ratio tends to infinity, the knockoff filter is consistent in the sense that one can make the FDR go to 0 and the power go to 1 simultaneously. In this work we study the case where the predictors have a general covariance matrix $\Sigma$. We introduce a simple functional of the covariance matrix $\Sigma$, called effective signal deficiency (ESD), that predicts consistency of various variable selection methods. In particular, ESD reveals that the structure of the precision matrix $\Sigma^{-1}$ plays a central role in consistency and, therefore, so does the conditional independence structure of the predictors. To leverage this connection, we introduce conditional independence knockoffs, a simple procedure that is able to compete with more sophisticated knockoff filters and that is well defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse). Our theoretical results are supported by numerical evidence on synthetic data.
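To make the selection rule concrete, the following is a minimal sketch of the standard knockoff+ selection step: given a vector of knockoff statistics $W_1, \dots, W_p$ (large positive values indicate evidence for a true signal), one picks the smallest threshold at which the estimated false discovery proportion drops below the target level $q$. This illustrates the generic knockoff filter, not the paper's specific construction, and the helper names `knockoff_threshold` and `knockoff_select` are hypothetical.

```python
import numpy as np

def knockoff_threshold(W, q):
    """Knockoff+ threshold: smallest t with estimated FDP <= q.

    W : array of knockoff statistics, one per variable.
    q : target false discovery rate level.
    """
    # Candidate thresholds are the nonzero magnitudes of the statistics.
    candidates = np.sort(np.abs(W[W != 0]))
    for t in candidates:
        # Knockoff+ FDP estimate: (1 + #{W_j <= -t}) / max(#{W_j >= t}, 1).
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold achieves the target level

def knockoff_select(W, q):
    """Indices of variables selected by the knockoff+ filter at level q."""
    t = knockoff_threshold(W, q)
    return np.flatnonzero(W >= t)
```

For example, with `W = [5, 4, 3, 2, 1, -0.5, -0.3]` and `q = 0.2`, the threshold is 1 and the first five variables are selected.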
Publication:
arXiv e-prints
 Pub Date:
 October 2019
 arXiv:
 arXiv:1910.12428
 Bibcode:
 2019arXiv191012428L
Keywords:
Mathematics - Statistics Theory;
Computer Science - Information Theory;
Computer Science - Machine Learning;
Statistics - Machine Learning
E-Print:
Accepted to NeurIPS 2019. The conference version includes the contents of this version excluding the appendices; v3 on arXiv corrected some typos in v2.