Autoencoding beyond pixels using a learned similarity metric
Abstract
We present an autoencoder that leverages learned representations to better measure similarities in data space. By combining a variational autoencoder with a generative adversarial network we can use learned feature representations in the GAN discriminator as basis for the VAE reconstruction objective. Thereby, we replace element-wise errors with feature-wise errors to better capture the data distribution while offering invariance towards e.g. translation. We apply our method to images of faces and show that it outperforms VAEs with element-wise similarity measures in terms of visual fidelity. Moreover, we show that the method learns an embedding in which high-level abstract visual features (e.g. wearing glasses) can be modified using simple arithmetic.
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2015
- DOI:
- 10.48550/arXiv.1512.09300
- arXiv:
- arXiv:1512.09300
- Bibcode:
- 2015arXiv151209300B
- Keywords:
-
- Computer Science - Machine Learning;
- Computer Science - Computer Vision and Pattern Recognition;
- Statistics - Machine Learning