Variational Inference with Tail-adaptive f-Divergence
Abstract
Variational inference with α-divergences has been widely used in modern probabilistic machine learning. Compared to the Kullback-Leibler (KL) divergence, a major advantage of α-divergences (with positive α values) is their mass-covering property. However, estimating and optimizing α-divergences requires importance sampling, which can have extremely large or infinite variance due to the heavy tails of the importance weights. In this paper, we propose a new class of tail-adaptive f-divergences that adaptively change the convex function f according to the tail of the importance weights, in a way that theoretically guarantees finite moments while simultaneously achieving mass-covering properties. We test our methods on Bayesian neural networks, as well as on deep reinforcement learning, where our method is applied to improve a recent soft actor-critic (SAC) algorithm. Our results show that our approach yields significant advantages compared with existing methods based on the classical KL and α-divergences.
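The heavy-tail issue mentioned in the abstract can be illustrated with a toy case that is not taken from the paper. The sketch below assumes a proposal q = N(0, 1), a target p = N(0, sigma^2), and an estimator built from powers w^α of the importance weight w = p/q, as is common for α-divergence objectives. All function names are hypothetical. For this Gaussian pair the k-th moment of w under q is finite exactly when k(1 - 1/sigma^2) < 1, so the variance of w^α (which needs k = 2α) can be infinite even for moderate α.

```python
import math
import random


def weight_moment_is_finite(k, sigma):
    """Return True if E_q[w^k] is finite for q = N(0, 1), p = N(0, sigma^2).

    Here w(x)^k * q(x) is proportional to exp(-(x^2 / 2) * (1 - k * c))
    with c = 1 - 1 / sigma^2, which is integrable iff k * c < 1.
    (Toy Gaussian assumption, not the paper's setting.)
    """
    return k * (1.0 - 1.0 / sigma ** 2) < 1.0


def log_weight(x, sigma):
    """log p(x) - log q(x) for p = N(0, sigma^2) and q = N(0, 1)."""
    return -math.log(sigma) + 0.5 * x * x * (1.0 - 1.0 / sigma ** 2)


def alpha_weight_samples(n, sigma, alpha, seed=0):
    """Draw x ~ q and return the values w(x)^alpha used by the estimator."""
    rng = random.Random(seed)
    return [
        math.exp(alpha * log_weight(rng.gauss(0.0, 1.0), sigma))
        for _ in range(n)
    ]


if __name__ == "__main__":
    sigma = 2.0
    # The variance of w^alpha requires the moment of order k = 2 * alpha.
    print(weight_moment_is_finite(2 * 0.5, sigma))  # True
    print(weight_moment_is_finite(2 * 1.0, sigma))  # False
    samples = alpha_weight_samples(10_000, sigma, alpha=1.0)
    print(max(samples) / (sum(samples) / len(samples)))
```

In this toy setting with sigma = 2, the variance is finite only for α < 2/3. The last printed ratio shows how far the largest sampled weight sits above the sample mean, which is the kind of behaviour that makes plain importance-sampling estimates unstable.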
- Publication: arXiv e-prints
- Pub Date: October 2018
- DOI: 10.48550/arXiv.1810.11943
- arXiv: arXiv:1810.11943
- Bibcode: 2018arXiv181011943W
- Keywords: Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: NeurIPS 2018