Optimally-Weighted Herding is Bayesian Quadrature
Abstract
Herding and kernel herding are deterministic methods of choosing samples which summarise a probability distribution. A related task is choosing samples for estimating integrals using Bayesian quadrature. We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature. We then show that sequential Bayesian quadrature can be viewed as a weighted version of kernel herding which achieves performance superior to any other weighted herding method. We demonstrate empirically a rate of convergence faster than O(1/N). Our results also imply an upper bound on the empirical error of the Bayesian quadrature estimate.
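The two sample-selection rules the abstract compares can be sketched concretely. The snippet below is an illustrative toy implementation, not the paper's code: it assumes a 1-D standard-normal target distribution and an RBF kernel, for which the kernel mean and initial variance have closed forms. Kernel herding greedily maximises the herding criterion, while sequential Bayesian quadrature (SBQ) greedily minimises the BQ posterior variance; the function names (`kernel_herding`, `sbq`) and the candidate-grid search are choices made here for illustration.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """RBF kernel k(a, b) = exp(-(a - b)^2 / (2 ell^2)) for 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

def kernel_mean(x, ell=1.0, sigma_p=1.0):
    """Closed-form kernel mean mu(x) = E_{y ~ N(0, sigma_p^2)}[k(x, y)]."""
    s2 = ell**2 + sigma_p**2
    return np.sqrt(ell**2 / s2) * np.exp(-0.5 * x**2 / s2)

def kernel_herding(n, grid, ell=1.0, sigma_p=1.0):
    """Greedy kernel herding: each step maximises mu(x) - mean_i k(x, x_i)."""
    samples = []
    mu = kernel_mean(grid, ell, sigma_p)
    for _ in range(n):
        score = mu.copy()
        if samples:
            score -= rbf(grid, np.array(samples), ell).mean(axis=1)
        samples.append(grid[np.argmax(score)])
    return np.array(samples)

def sbq_variance(xs, ell=1.0, sigma_p=1.0):
    """BQ posterior variance of the integral estimate given nodes xs."""
    K = rbf(xs, xs, ell) + 1e-10 * np.eye(len(xs))  # jitter for stability
    z = kernel_mean(xs, ell, sigma_p)
    v0 = ell / np.sqrt(ell**2 + 2 * sigma_p**2)  # double integral of k under p
    return v0 - z @ np.linalg.solve(K, z)

def sbq(n, grid, ell=1.0, sigma_p=1.0):
    """Sequential BQ: greedily add the point minimising posterior variance."""
    samples = []
    for _ in range(n):
        scores = [sbq_variance(np.append(samples, x), ell, sigma_p)
                  for x in grid]
        samples.append(grid[np.argmin(scores)])
    return np.array(samples)
```

The paper's equivalence result says the herding criterion above is (up to weighting) the same posterior variance that SBQ minimises; SBQ additionally re-weights all previous samples optimally at each step, which is the source of its faster empirical convergence.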
- Publication: arXiv e-prints
- Pub Date: April 2012
- DOI: 10.48550/arXiv.1204.1664
- arXiv: arXiv:1204.1664
- Bibcode: 2012arXiv1204.1664H
- Keywords: Statistics - Machine Learning; Mathematics - Numerical Analysis; G.1.4
- E-Print: Accepted as an oral presentation at Uncertainty in Artificial Intelligence 2012. Updated to fix several typos