Unsupervised Deep Representations for Learning Audience Facial Behaviors

doi:10.48550/arXiv.1805.04136

In this paper, we present an unsupervised learning approach for analyzing facial behavior based on a deep generative model combined with a convolutional neural network (CNN). We jointly train a variational auto-encoder (VAE) and a generative adversarial network (GAN) to learn a powerful latent representation from footage of audiences viewing feature-length movies. We show that the learned latent representation successfully encodes meaningful signatures of behaviors related to audience engagement (smiling & laughing) and disengagement (yawning). Our results provide a proof of concept for a more general methodology for annotating hard-to-label multimedia data featuring sparse examples of signals of interest.

Publication: arXiv e-prints

Pub Date: May 2018

DOI: 10.48550/arXiv.1805.04136

arXiv: arXiv:1805.04136

Bibcode: 2018arXiv180504136S

Keywords: Computer Science - Computer Vision and Pattern Recognition
