Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity
Abstract
Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users' private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, and describe an algorithm whose privacy cost is polylogarithmic in the number of changes to a user's value. More fundamentally, by building on anonymity of the users' reports, we also demonstrate how the privacy cost of our LDP algorithm can actually be much lower when viewed in the central model of differential privacy. We show, via a new and general privacy amplification technique, that any permutation-invariant algorithm satisfying $\varepsilon$-local differential privacy will satisfy $(O(\varepsilon \sqrt{\log(1/\delta)/n}), \delta)$-central differential privacy. By this, we explain how the high noise and $\sqrt{n}$ overhead of LDP protocols is a consequence of them being significantly more private in the central model. As a practical corollary, our results imply that several LDP-based industrial deployments may have much lower privacy cost than their advertised $\varepsilon$ would indicate, at least if reports are anonymized.
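To give a sense of the scale of this amplification, the sketch below evaluates the quantity $\varepsilon \sqrt{\log(1/\delta)/n}$ from the stated bound. The bound is asymptotic, so the constant hidden by the $O(\cdot)$ is omitted here; this is an illustrative magnitude only, not the paper's exact guarantee.

```python
import math

def amplified_epsilon(eps_local, n, delta):
    """Magnitude of the central-DP epsilon implied by the shuffling bound,
    eps_central = O(eps_local * sqrt(log(1/delta) / n)).
    The O(.) constant is intentionally omitted (illustrative sketch)."""
    return eps_local * math.sqrt(math.log(1.0 / delta) / n)

# Example: a local epsilon of 1 across a million anonymized reports,
# with delta = 1e-6, yields a central epsilon on the order of 0.004.
eps_central = amplified_epsilon(1.0, n=10**6, delta=1e-6)
```

Note how the $\sqrt{n}$ in the denominator drives the effect: the more anonymized reports are shuffled together, the smaller the effective central-model privacy cost.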
 Publication:

arXiv e-prints
 Pub Date:
 November 2018
 arXiv:
 arXiv:1811.12469
 Bibcode:
 2018arXiv181112469E
 Keywords:

 Computer Science - Machine Learning;
 Computer Science - Cryptography and Security;
 Computer Science - Data Structures and Algorithms;
 Statistics - Machine Learning
 E-Print:
 Stated amplification bounds for epsilon >