Robust leave-one-out cross-validation for high-dimensional Bayesian models
Abstract
Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy. However, computing LOO-CV criteria can be computationally expensive due to the need to fit the model multiple times. In the Bayesian context, importance sampling provides a possible solution but classical approaches can easily produce estimators whose asymptotic variance is infinite, making them potentially unreliable. Here we propose and analyze a novel mixture estimator to compute Bayesian LOO-CV criteria. Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite asymptotic variance of the resulting estimators. Both theoretical and numerical results are provided to illustrate the improved robustness and efficiency. The computational benefits are particularly significant in high-dimensional problems, allowing to perform Bayesian LOO-CV for a broader range of models, and datasets with highly influential observations. The proposed methodology is easily implementable in standard probabilistic programming software and has a computational cost roughly equivalent to fitting the original model once.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2022
- DOI:
- 10.48550/arXiv.2209.09190
- arXiv:
- arXiv:2209.09190
- Bibcode:
- 2022arXiv220909190S
- Keywords:
-
- Statistics - Computation;
- Statistics - Methodology;
- Statistics - Machine Learning
- E-Print:
- doi:10.1080/01621459.2023.2257893