Optimal estimation in private distributed functional data analysis
Abstract
We systematically investigate the preservation of differential privacy in functional data analysis, beginning with functional mean estimation and extending to varying coefficient model estimation. Our work introduces a distributed learning framework involving multiple servers, each responsible for collecting several sparsely observed functions. This hierarchical setup introduces a mixed notion of privacy. Within each function, user-level differential privacy is applied to $m$ discrete observations. At the server level, central differential privacy is deployed to account for the centralised nature of data collection. Across servers, only private information is exchanged, adhering to federated differential privacy constraints. To address this complex hierarchy, we employ minimax theory to reveal several fundamental phenomena: from sparse to dense functional data analysis, from user-level to central and federated differential privacy costs, and the intricate interplay between different regimes of functional data analysis and privacy preservation. To the best of our knowledge, this is the first study to rigorously examine functional data estimation under multiple privacy constraints. Our theoretical findings are complemented by efficient private algorithms and extensive numerical evidence, providing a comprehensive exploration of this challenging problem.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2024
- DOI:
- 10.48550/arXiv.2412.06582
- arXiv:
- arXiv:2412.06582
- Bibcode:
- 2024arXiv241206582X
- Keywords:
-
- Mathematics - Statistics Theory;
- Statistics - Methodology