Real-Time 3D Facial Tracking via Cascaded Compositional Learning

doi:10.1109/TIP.2021.3065819

We propose to learn a cascade of globally-optimized modular boosted ferns (GoMBF) to solve multi-modal facial motion regression for real-time 3D facial tracking from a monocular RGB camera. GoMBF is a deep composition of multiple regression models, each of which is a boosted-ferns regressor initially trained to predict partial motion parameters of the same modality; these are then concatenated via a global optimization step to form a single strong boosted-ferns regressor that can effectively handle the whole regression target. GoMBF explicitly copes with the variety of modalities in the output variables, while exhibiting greater fitting power and faster learning than conventional boosted ferns. By further cascading a sequence of GoMBFs (GoMBF-Cascade) to regress facial motion parameters, we achieve competitive tracking performance on a variety of in-the-wild videos compared with state-of-the-art methods, which require much more training data or have higher computational complexity. The approach provides a robust and elegant solution to real-time 3D facial tracking using a small set of training data, which makes it more practical in real-world applications.
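The two-phase idea in the abstract (train one boosted-ferns module per output modality, then merge all modules through a global optimization) can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration, not the authors' implementation: the names `Fern`, `fit_boosted_ferns`, `fit_gombf` and `predict_gombf` are hypothetical, ferns threshold randomly chosen entries of a generic feature vector, and the global step is approximated by a ridge regression that refits all fern bin outputs jointly against the full target.

```python
import numpy as np


class Fern:
    """A fern: `depth` binary threshold tests that index one of 2**depth bins."""

    def __init__(self, n_features, depth, rng):
        self.depth = depth
        self.feat = rng.integers(0, n_features, size=depth)   # tested features
        self.thresh = rng.uniform(-1.0, 1.0, size=depth)      # random thresholds
        self.bins = None

    def index(self, X):
        # Pack the test outcomes into an integer bin index per sample.
        bits = (X[:, self.feat] > self.thresh).astype(int)
        return bits @ (1 << np.arange(self.depth))

    def fit(self, X, R, shrink=1.0):
        # Each bin stores the (shrunken) mean residual of the samples it holds.
        idx = self.index(X)
        self.bins = np.zeros((2 ** self.depth, R.shape[1]))
        for b in range(2 ** self.depth):
            mask = idx == b
            if mask.any():
                self.bins[b] = R[mask].sum(axis=0) / (mask.sum() + shrink)
        return self

    def predict(self, X):
        return self.bins[self.index(X)]


def fit_boosted_ferns(X, Y, n_ferns, depth, rng, lr=0.5):
    """Plain gradient-boosting loop: each fern fits the current residual."""
    ferns, R = [], Y.astype(float).copy()
    for _ in range(n_ferns):
        fern = Fern(X.shape[1], depth, rng).fit(X, R)
        fern.bins *= lr
        R -= fern.predict(X)
        ferns.append(fern)
    return ferns


def one_hot(ferns, X):
    """Stack one indicator block per fern: which bin each sample falls into."""
    n = X.shape[0]
    blocks = []
    for fern in ferns:
        B = np.zeros((n, 2 ** fern.depth))
        B[np.arange(n), fern.index(X)] = 1.0
        blocks.append(B)
    return np.hstack(blocks)


def fit_gombf(X, Y, groups, n_ferns=20, depth=4, ridge=1.0, seed=0):
    """Phase 1: one boosted-ferns module per output group (modality).
    Phase 2 (assumed form): keep the fern structures, and jointly refit
    all bin outputs against the whole target with ridge regression."""
    rng = np.random.default_rng(seed)
    ferns = []
    for cols in groups:
        ferns += fit_boosted_ferns(X, Y[:, cols], n_ferns, depth, rng)
    Phi = one_hot(ferns, X)
    A = Phi.T @ Phi + ridge * np.eye(Phi.shape[1])
    W = np.linalg.solve(A, Phi.T @ Y)   # global bin outputs, full output dim
    return ferns, W


def predict_gombf(model, X):
    ferns, W = model
    return one_hot(ferns, X) @ W
```

With a target matrix `Y` whose columns are split into, say, `groups=[[0, 1], [2]]`, `fit_gombf(X, Y, groups)` returns the concatenated ferns plus the globally refit weights, and `predict_gombf` evaluates them on new features. A cascade in the sense of the abstract would chain several such stages, each regressing the residual motion left by the previous one; that loop is omitted here.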

Publication:

IEEE Transactions on Image Processing

Pub Date:

2021

DOI:

10.1109/TIP.2021.3065819

arXiv:

arXiv:2009.00935

Bibcode:

2021ITIP...30.3844L

Keywords:

Computer Science - Computer Vision and Pattern Recognition
