Wavelet-based Unsupervised Label-to-Image Translation
Abstract
Semantic Image Synthesis (SIS) is a subclass of image-to-image translation where a semantic layout is used to generate a photorealistic image. State-of-the-art conditional Generative Adversarial Networks (GANs) need a huge amount of paired data to accomplish this task while generic unpaired image-to-image translation frameworks underperform in comparison, because they color-code semantic layouts and learn correspondences in appearance instead of semantic content. Starting from the assumption that a high quality generated image should be segmented back to its semantic layout, we propose a new Unsupervised paradigm for SIS (USIS) that makes use of a self-supervised segmentation loss and whole image wavelet based discrimination. Furthermore, in order to match the high-frequency distribution of real images, a novel generator architecture in the wavelet domain is proposed. We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2023
- DOI:
- 10.48550/arXiv.2305.09647
- arXiv:
- arXiv:2305.09647
- Bibcode:
- 2023arXiv230509647E
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Electrical Engineering and Systems Science - Image and Video Processing
- E-Print:
- arXiv admin note: substantial text overlap with arXiv:2109.14715