Decoding visemes: improving machine lipreading

doi:10.48550/arXiv.1710.01169

To undertake machine lip-reading, we try to recognise speech from a visual signal. Current work often uses viseme classification supported by language models with varying degrees of success. A few recent works suggest phoneme classification, in the right circumstances, can outperform viseme classification. In this work we present a novel two-pass method of training phoneme classifiers which uses previously trained visemes in the first pass. With our new training algorithm, we show classification performance which significantly improves on previous lip-reading results.

Publication: arXiv e-prints

Pub Date: October 2017

DOI: 10.48550/arXiv.1710.01169

arXiv: arXiv:1710.01169

Bibcode: 2017arXiv171001169B

Keywords: Computer Science - Computer Vision and Pattern Recognition; Electrical Engineering and Systems Science - Audio and Speech Processing

E-Print: Helen L Bear and Richard Harvey. Decoding visemes: improving machine lipreading. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016. p2009-2013
