Provably Efficient Posterior Sampling for Sparse Linear Regression via Measure Decomposition
Abstract
We consider the problem of sampling from the posterior distribution of a $d$-dimensional coefficient vector $\boldsymbol{\theta}$, given linear observations $\boldsymbol{y} = \boldsymbol{X}\boldsymbol{\theta}+\boldsymbol{\varepsilon}$. In general, such posteriors are multimodal, and therefore challenging to sample from. This observation has prompted the exploration of various heuristics that aim at approximating the posterior distribution. In this paper, we study a different approach based on decomposing the posterior distribution into a log-concave mixture of simple product measures. This decomposition allows us to reduce sampling from a multimodal distribution of interest to sampling from a log-concave one, which is tractable and has been investigated in detail. We prove that, under mild conditions on the prior, for random designs, such measure decomposition is generally feasible when the number of samples per parameter $n/d$ exceeds a constant threshold. We thus obtain a provably efficient (polynomial time) sampling algorithm in a regime where this was previously not known. Numerical simulations confirm that the algorithm is practical, and reveal that it has attractive statistical properties compared to state-of-the-art methods.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2024
- DOI:
- 10.48550/arXiv.2406.19550
- arXiv:
- arXiv:2406.19550
- Bibcode:
- 2024arXiv240619550M
- Keywords:
-
- Statistics - Methodology;
- Mathematics - Statistics Theory
- E-Print:
- 29 pages, 10 figures