Private False Discovery Rate Control
Abstract
We provide the first differentially private algorithms for controlling the false discovery rate (FDR) in multiple hypothesis testing, with essentially no loss in power under certain conditions. Our general approach is to adapt a well-known variant of the Benjamini-Hochberg procedure (BHq), making each step differentially private. This destroys the classical proof of FDR control. To prove FDR control of our method, (a) we develop a new proof of the original (non-private) BHq algorithm and its robust variants: a proof requiring only the assumption that the true null test statistics are independent, allowing for arbitrary correlations between the true nulls and false nulls. This assumption is fairly weak compared to those previously shown in the vast literature on this topic, and explains in part the empirical robustness of BHq. Then (b) we relate the FDR control properties of the differentially private version to the control properties of the non-private version. We also present a low-distortion "one-shot" differentially private primitive for "top-$k$" problems, e.g., "Which are the $k$ most popular hobbies?" (which we apply to: "Which hypotheses have the $k$ most significant $p$-values?"), and use it to get a faster privacy-preserving instantiation of our general approach at little cost in accuracy. The proof of privacy for the one-shot top-$k$ algorithm introduces a new technique of independent interest.
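As a rough illustration of the two ingredients named in the abstract, the sketch below shows the classical (non-private) BHq step-up rule and a generic "one-shot" noisy top-$k$ selection that perturbs every score once and reports the $k$ largest. It is not the paper's algorithm: the function names, the use of Laplace noise, and the `noise_scale` parameter are assumptions made for illustration, and no privacy guarantee is claimed for any particular noise scale. To select the most significant $p$-values with this sketch, one would pass scores that grow as $p$-values shrink (for example, negated $p$-values), which is also an assumption.

```python
import numpy as np


def benjamini_hochberg(p_values, q):
    """Classical, non-private BHq step-up rule.

    Rejects the k smallest p-values, where k is the largest rank i
    such that the i-th smallest p-value is at most q * i / m.
    Returns the sorted indices of the rejected hypotheses.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                      # ranks hypotheses by p-value
    thresholds = q * np.arange(1, m + 1) / m   # q * i / m for i = 1..m
    below = p[order] <= thresholds
    if not below.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(below)[0]) + 1       # largest rank passing its threshold
    return np.sort(order[:k])


def one_shot_noisy_top_k(scores, k, noise_scale, rng):
    """Illustrative one-shot top-k: add noise to all scores once, keep the k largest.

    noise_scale is a free (hypothetical) parameter; calibrating it to obtain
    a differential privacy guarantee is outside the scope of this sketch.
    """
    scores = np.asarray(scores, dtype=float)
    noisy = scores + rng.laplace(loc=0.0, scale=noise_scale, size=scores.size)
    return np.argsort(-noisy)[:k]              # indices of the k largest noisy scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(benjamini_hochberg([0.02, 0.03, 0.035, 0.9], q=0.05))
    print(one_shot_noisy_top_k([10, 50, 30, 40], k=2, noise_scale=1.0, rng=rng))
```

The first call shows the step-up behaviour: the two smallest $p$-values miss their own thresholds, yet all three are rejected because the third one passes its threshold. The "one-shot" selection draws noise a single time for all scores, in contrast to selecting one item per round over $k$ rounds.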
 Publication:

arXiv e-prints
 Pub Date:
 November 2015
 arXiv:
 arXiv:1511.03803
 Bibcode:
 2015arXiv151103803D
 Keywords:

 Mathematics - Statistics Theory;
 Computer Science - Data Structures and Algorithms;
 Statistics - Machine Learning