SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute
Abstract
Manipulating latent code in generative adversarial networks (GANs) for facial image synthesis mainly focuses on continuous attribute synthesis (e.g., age, pose and emotion), while discrete attribute synthesis (like face mask and eyeglasses) receives less attention. Directly applying existing works to facial discrete attributes may cause inaccurate results. In this work, we propose an innovative framework to tackle challenging facial discrete attribute synthesis via semantic decomposing, dubbed SD-GAN. To be concrete, we explicitly decompose the discrete attribute representation into two components, i.e. the semantic prior basis and offset latent representation. The semantic prior basis shows an initializing direction for manipulating face representation in the latent space. The offset latent presentation obtained by 3D-aware semantic fusion network is proposed to adjust prior basis. In addition, the fusion network integrates 3D embedding for better identity preservation and discrete attribute synthesis. The combination of prior basis and offset latent representation enable our method to synthesize photo-realistic face images with discrete attributes. Notably, we construct a large and valuable dataset MEGN (Face Mask and Eyeglasses images crawled from Google and Naver) for completing the lack of discrete attributes in the existing dataset. Extensive qualitative and quantitative experiments demonstrate the state-of-the-art performance of our method. Our code is available at: https://github.com/MontaEllis/SD-GAN.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2022
- DOI:
- 10.48550/arXiv.2207.05300
- arXiv:
- arXiv:2207.05300
- Bibcode:
- 2022arXiv220705300K
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- 16 pages, 12 figures, Accepted by ACM MM2022