Causal Representation Learning for Context-Aware Face Transfer
Abstract
Human face synthesis involves transferring knowledge about the identity and identity-dependent face shape (IDFS) of a human face to target face images where the context (e.g., facial expressions, head poses, and other background factors) may change dramatically. Human faces are non-rigid: facial expressions deform the face shape, and head pose further changes how the face appears in 2D images. A key challenge in face transfer is therefore to match the face to unobserved new contexts, adapting its appearance to different poses and expressions accordingly. In this work, we provide prior knowledge that lets generative models reason about the appropriate appearance of a human face under various expressions and poses. We propose a novel context-aware face transfer method, called CarTrans, that incorporates the causal effects of contextual factors into the face representation and is thus aware of the uncertainty of new contexts. We estimate the effect of facial expression and head pose via counterfactual inference, designing a controlled intervention trial that avoids the need for a large number of observations to cover the pose-expression space well. Moreover, we propose a kernel regression-based encoder that eliminates the identity specificity of target faces when encoding contextual information from target images. The resulting method shows impressive performance, allowing fine-grained control over face shape and appearance under various contextual conditions.
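The abstract does not spell out the kernel regression-based encoding, but the underlying operation is standard Nadaraya-Watson kernel regression: each target (query) feature is replaced by a similarity-weighted average over a set of reference features, which smooths away query-specific (here, identity-specific) detail while retaining contextual structure. The sketch below is a generic illustration of that operation, not the paper's implementation; all function names, shapes, and the choice of a Gaussian kernel are assumptions.

```python
import numpy as np

def gaussian_kernel(sq_dist, bandwidth):
    """Gaussian (RBF) kernel evaluated on squared distances."""
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def kernel_regression(queries, keys, values, bandwidth=1.0):
    """Nadaraya-Watson kernel regression (illustrative, not CarTrans itself).

    Each query feature is replaced by a similarity-weighted average of
    reference `values`, smoothing away query-specific detail.
    queries: (q, d), keys: (n, d), values: (n, d) -> (q, d)
    """
    # Pairwise squared Euclidean distances between queries and keys.
    sq_dist = ((queries[:, None, :] - keys[None, :, :]) ** 2).sum(axis=-1)
    w = gaussian_kernel(sq_dist, bandwidth)
    w = w / w.sum(axis=1, keepdims=True)  # normalize weights per query
    return w @ values

# Toy usage: 4 hypothetical "target face" features regressed onto
# 8 reference features (all data random, for shape illustration only).
rng = np.random.default_rng(0)
keys = rng.normal(size=(8, 3))
values = rng.normal(size=(8, 3))
queries = rng.normal(size=(4, 3))
out = kernel_regression(queries, keys, values, bandwidth=0.5)
print(out.shape)  # (4, 3)
```

Because the output is a convex combination of the reference values, it cannot carry information present only in an individual query, which is the intuition behind using such smoothing to strip identity-specific content from a context encoding.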
- Publication: arXiv e-prints
- Pub Date: October 2021
- DOI: 10.48550/arXiv.2110.01571
- arXiv: arXiv:2110.01571
- Bibcode: 2021arXiv211001571G
- Keywords: Computer Science - Computer Vision and Pattern Recognition; Statistics - Methodology