DVAE++: Discrete Variational Autoencoders with Overlapping Transformations

doi:10.48550/arXiv.1802.04920

Training of discrete latent variable models remains challenging because passing gradient information through discrete units is difficult. We propose a new class of smoothing transformations based on a mixture of two overlapping distributions, and show that the proposed transformation can be used for training binary latent models with either directed or undirected priors. We derive a new variational bound to efficiently train with Boltzmann machine priors. Using this bound, we develop DVAE++, a generative model with a global discrete prior and a hierarchy of convolutional continuous variables. Experiments on several benchmarks show that overlapping transformations outperform other recent continuous relaxations of discrete latent variables, including Gumbel-Softmax (Maddison et al., 2016; Jang et al., 2016) and discrete variational autoencoders (Rolfe, 2016).
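As a rough illustration of what "a mixture of two overlapping distributions" can look like, the sketch below smooths a binary variable z into a continuous ζ in [0, 1]. It assumes, purely for illustration, that the two overlapping components are exponential densities truncated to [0, 1], one peaked at 0 and one peaked at 1, with a sharpness parameter `beta`. The function name `sample_smoothed` and the parameters `q` and `beta` are hypothetical and not taken from this record. The sketch samples z first and then ζ given z, so it shows only the shape of the relaxation, not a differentiable sampling path.

```python
import numpy as np


def sample_smoothed(q, beta, rng):
    """Draw (z, zeta) with z ~ Bernoulli(q) and zeta ~ r(zeta | z).

    Assumed components (illustrative choice only):
      r(zeta | z=0) proportional to exp(-beta * zeta)        on [0, 1]
      r(zeta | z=1) proportional to exp(beta * (zeta - 1))   on [0, 1]
    Both have support on all of [0, 1], so they overlap.
    """
    q = np.asarray(q, dtype=float)
    z = rng.random(q.shape) < q          # binary latent sample
    u = rng.random(q.shape)              # uniform noise for the inverse CDF
    # Inverse CDF of the exponential truncated to [0, 1] (the z=0 component).
    zeta0 = -np.log1p(-u * (1.0 - np.exp(-beta))) / beta
    # The z=1 component is the mirror image of the z=0 component about 1/2.
    zeta = np.where(z, 1.0 - zeta0, zeta0)
    return z.astype(int), zeta


rng = np.random.default_rng(0)
z, zeta = sample_smoothed(np.full(5, 0.9), beta=8.0, rng=rng)
```

Under this assumption, a larger `beta` concentrates ζ closer to the binary value z, while a smaller `beta` increases the overlap between the two components.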

Publication:

arXiv e-prints

Pub Date:

February 2018

DOI:

10.48550/arXiv.1802.04920

arXiv:

arXiv:1802.04920

Bibcode:

2018arXiv180204920V

Keywords:

Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

Published as a conference paper at International Conference on Machine Learning (ICML), 2018
