Real-Time 3D Facial Tracking via Cascaded Compositional Learning
Abstract
We propose to learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera. GoMBF is a deep composition of multiple regression models, each of which is a boosted-ferns regressor initially trained to predict partial motion parameters of the same modality. These regressors are then concatenated via a global optimization step to form a single strong boosted-ferns model that can effectively handle the whole regression target. GoMBF explicitly copes with the variety of modalities in the output variables, while showing greater fitting power and faster learning than conventional boosted ferns. By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial motion parameters, we achieve competitive tracking performance on a variety of in-the-wild videos compared with state-of-the-art methods, which require much more training data or have higher computational complexity. GoMBF-Cascade provides a robust and highly elegant solution to real-time 3D facial tracking using a small set of training data, and is hence more practical in real-world applications.
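The two-stage idea described in the abstract (train one boosted-ferns module per output modality, then concatenate the modules and re-optimize them jointly) can be illustrated with a rough sketch. This is not the authors' implementation. The random feature and threshold selection, the shrinkage factor, the ridge-regression refit of the bin outputs, the synthetic data, and the names `Fern`, `train_boosted_ferns`, `encode`, `global_refit`, and `groups` are all assumptions made for illustration, and the cascade stage is omitted.

```python
import numpy as np

class Fern:
    """One fern: `depth` threshold tests that index one of 2**depth bins."""
    def __init__(self, feat_idx, thresholds):
        self.feat_idx, self.thresholds = feat_idx, thresholds

    def bins(self, X):
        # Turn the binary test results into an integer bin index per sample.
        bits = (X[:, self.feat_idx] > self.thresholds).astype(int)
        return bits @ (1 << np.arange(len(self.feat_idx)))

def train_boosted_ferns(X, Y, n_ferns, depth, rng, shrink=0.5):
    """Fit ferns one after another on the residual left by the previous ones."""
    ferns, outputs = [], []
    residual = Y.copy()
    for _ in range(n_ferns):
        idx = rng.choice(X.shape[1], size=depth, replace=False)
        thr = rng.uniform(X[:, idx].min(axis=0), X[:, idx].max(axis=0))
        fern = Fern(idx, thr)
        b = fern.bins(X)
        out = np.zeros((2 ** depth, Y.shape[1]))
        for k in range(2 ** depth):
            mask = b == k
            if mask.any():
                out[k] = shrink * residual[mask].mean(axis=0)
        residual -= out[b]
        ferns.append(fern)
        outputs.append(out)
    return ferns, outputs

def predict_boosted(ferns, outputs, X):
    return sum(out[f.bins(X)] for f, out in zip(ferns, outputs))

def encode(ferns, X, depth):
    """One-hot encode the bin each sample falls into, for every fern."""
    n_bins = 2 ** depth
    Phi = np.zeros((X.shape[0], len(ferns) * n_bins))
    rows = np.arange(X.shape[0])
    for i, f in enumerate(ferns):
        Phi[rows, i * n_bins + f.bins(X)] = 1.0
    return Phi

def global_refit(ferns, X, Y, depth, lam=1e-3):
    """Assumed global step: ridge regression from all bins to the full target."""
    Phi = encode(ferns, X, depth)
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ Y)

# Synthetic stand-in data; the split into two output groups is hypothetical.
rng = np.random.default_rng(0)
DEPTH = 3
X = rng.normal(size=(400, 10))
Y = np.column_stack([np.sin(X[:, 0]), 0.5 * X[:, 1], np.tanh(X[:, 2]),
                     (X[:, 3] > 0).astype(float), np.abs(X[:, 4])])
groups = [[0, 1, 2], [3, 4]]

# Stage 1: one boosted-ferns module per output group.
all_ferns = []
modular_pred = np.zeros_like(Y)
for cols in groups:
    ferns, outs = train_boosted_ferns(X, Y[:, cols], 10, DEPTH, rng)
    modular_pred[:, cols] = predict_boosted(ferns, outs, X)
    all_ferns += ferns

# Stage 2: concatenate all ferns and refit their bin outputs jointly.
W = global_refit(all_ferns, X, Y, DEPTH)
global_pred = encode(all_ferns, X, DEPTH) @ W

mse_modular = float(np.mean((Y - modular_pred) ** 2))
mse_global = float(np.mean((Y - global_pred) ** 2))
print(f"training MSE, modular: {mse_modular:.4f}, after global refit: {mse_global:.4f}")
```

In this sketch the joint refit lets every fern contribute to every output dimension, which is one plausible reading of how concatenation yields a single regressor for the whole target. The paper's actual global optimization step may differ.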
- Publication: IEEE Transactions on Image Processing
- Pub Date: 2021
- DOI: 10.1109/TIP.2021.3065819
- arXiv: arXiv:2009.00935
- Bibcode: 2021ITIP...30.3844L
- Keywords: Computer Science - Computer Vision and Pattern Recognition
- E-Print: doi:10.1109/TIP.2021.3065819