Variational image compression with a scale hyperprior
Abstract
We describe an end-to-end trainable model for image compression based on variational autoencoders. The model incorporates a hyperprior to effectively capture spatial dependencies in the latent representation. This hyperprior relates to side information, a concept universal to virtually all modern image codecs, but largely unexplored in image compression using artificial neural networks (ANNs). Unlike existing autoencoder compression methods, our model trains a complex prior jointly with the underlying autoencoder. We demonstrate that this model leads to state-of-the-art image compression when measuring visual quality using the popular MS-SSIM index, and yields rate-distortion performance surpassing published ANN-based methods when evaluated using a more traditional metric based on squared error (PSNR). Furthermore, we provide a qualitative comparison of models trained for different distortion metrics.
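The abstract gives no formulas, so the following is only a minimal sketch of how a scale hyperprior can enter a rate-distortion objective. It assumes each rounded latent is modeled as a zero-mean Gaussian whose standard deviation is supplied by the side information, with probabilities taken over unit-width bins. The function names (`gaussian_cdf`, `latent_bits`, `rd_loss`) and the trade-off weight `lam` are hypothetical and not taken from the record above.

```python
import math


def gaussian_cdf(x, scale):
    """CDF of a zero-mean Gaussian with standard deviation `scale`."""
    return 0.5 * (1.0 + math.erf(x / (scale * math.sqrt(2.0))))


def latent_bits(y_hat, scales):
    """Estimated bits needed to code integer-rounded latents.

    Assumption: each latent y is zero-mean Gaussian with its own scale,
    and its probability is the Gaussian mass of the bin [y - 0.5, y + 0.5].
    """
    total = 0.0
    for y, s in zip(y_hat, scales):
        p = gaussian_cdf(y + 0.5, s) - gaussian_cdf(y - 0.5, s)
        # Clamp to avoid log(0) when the scale badly underestimates |y|.
        total += -math.log2(max(p, 1e-12))
    return total


def rd_loss(bits_latents, bits_side_info, distortion, lam):
    """Rate-distortion objective: total rate plus weighted distortion.

    The rate includes the side information that carries the scales.
    """
    return bits_latents + bits_side_info + lam * distortion
```

Under these assumptions, a latent costs few bits when the predicted scale matches its magnitude and many bits when it does not, which is why spending some rate on side information can lower the total.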
- Publication: arXiv e-prints
- Pub Date: January 2018
- DOI: 10.48550/arXiv.1802.01436
- arXiv: arXiv:1802.01436
- Bibcode: 2018arXiv180201436B
- Keywords: Electrical Engineering and Systems Science - Image and Video Processing; Computer Science - Information Theory
- E-Print: accepted as a conference contribution to International Conference on Learning Representations 2018