PAPRIKA: Private Online False Discovery Rate Control
Abstract
In hypothesis testing, a false discovery occurs when a hypothesis is incorrectly rejected due to noise in the sample. When adaptively testing multiple hypotheses, the probability of a false discovery increases as more tests are performed. Thus the problem of False Discovery Rate (FDR) control is to find a procedure for testing multiple hypotheses that accounts for this effect in determining the set of hypotheses to reject. The goal is to minimize the number (or fraction) of false discoveries, while maintaining a high true positive rate (i.e., correct discoveries). In this work, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy for the sample. Unlike previous work in this direction, we focus on the online setting, meaning that a decision about each hypothesis must be made immediately after the test is performed, rather than waiting for the output of all tests as in the offline setting. We provide new private algorithms based on state-of-the-art results in non-private online FDR control. Our algorithms have strong provable guarantees for privacy and statistical performance as measured by FDR and power. We also provide experimental results to demonstrate the efficacy of our algorithms in a variety of data environments.
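To make the online setting concrete, the following is a minimal, non-private sketch in which each hypothesis is accepted or rejected as soon as its p-value arrives. It is not the PAPRIKA algorithm and includes no differential privacy mechanism. It assumes a simple alpha-spending rule (test level alpha * 6 / (pi^2 t^2) at step t, so the levels sum to at most alpha), which is a conservative baseline rather than the state-of-the-art procedures the abstract refers to. The function names and the example p-values are hypothetical.

```python
import math


def online_alpha_spending(p_values, alpha=0.05):
    """Decide each hypothesis immediately, using only its own p-value.

    Assumed rule (illustrative baseline, not from the paper): at step t
    the test level is alpha * 6 / (pi^2 * t^2). These levels sum to at
    most alpha over all t, so the family-wise error rate, and hence the
    FDR, is at most alpha.
    """
    decisions = []
    for t, p in enumerate(p_values, start=1):
        level = alpha * 6.0 / (math.pi ** 2 * t ** 2)
        decisions.append(p <= level)  # decision is final once made
    return decisions


def false_discovery_proportion(decisions, is_null):
    """Fraction of rejections that are true nulls (0 if nothing is rejected)."""
    rejections = sum(decisions)
    false_rejections = sum(d and n for d, n in zip(decisions, is_null))
    return false_rejections / max(rejections, 1)


# Hypothetical stream of p-values with known ground truth, for illustration.
p_values = [0.001, 0.5, 0.004, 0.0001]
is_null = [False, True, True, True]

decisions = online_alpha_spending(p_values, alpha=0.05)
fdp = false_discovery_proportion(decisions, is_null)
```

The FDR is the expectation of this false discovery proportion over the randomness in the sample. Power is the expected fraction of non-null hypotheses that are rejected.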
 Publication:
 arXiv e-prints
 Pub Date:
 February 2020
 arXiv:
 arXiv:2002.12321
 Bibcode:
 2020arXiv200212321Z
 Keywords:
 Statistics - Machine Learning;
 Computer Science - Cryptography and Security;
 Computer Science - Data Structures and Algorithms;
 Computer Science - Machine Learning;
 Mathematics - Statistics Theory;
 Statistics - Methodology