Acoustic echo cancellation with the dual-signal transformation LSTM network

doi:10.48550/arXiv.2010.14337


This paper applies the dual-signal transformation LSTM network (DTLN) to real-time acoustic echo cancellation (AEC). The DTLN combines a short-time Fourier transform (STFT) and a learned feature representation in a stacked network approach. This enables robust processing in both the time-frequency domain and the time domain, so that phase information is also taken into account. The model is trained on only 60 h of real and synthetic echo scenarios. The training setup includes multilingual speech, data augmentation, additional noise, and reverberation, with the aim of producing a model that generalizes well to a wide variety of real-world conditions. The DTLN approach achieves state-of-the-art performance in clean and noisy echo conditions, robustly reducing both acoustic echo and additional noise. The method outperforms the AEC-Challenge baseline by 0.30 in terms of Mean Opinion Score (MOS).
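The stacked approach described in the abstract can be illustrated with a rough, single-frame sketch in NumPy. This is not the authors' implementation. The frame length, the feature size, the random encoder and decoder matrices, the use of a far-end reference frame as a second input, and the sigmoid `placeholder_mask` function (standing in for the trained LSTM mask estimators) are all assumptions made for illustration only.

```python
import numpy as np

FRAME_LEN = 512  # assumed frame length in samples (hypothetical)
FEAT_DIM = 256   # assumed size of the learned representation (hypothetical)


def placeholder_mask(mic_feat, far_feat):
    """Hypothetical stand-in for a trained LSTM mask estimator.

    A sigmoid of the feature difference, so every mask value lies in (0, 1).
    """
    return 1.0 / (1.0 + np.exp(-(mic_feat - far_feat)))


def stft_stage(mic, far):
    """Stage 1: mask the STFT magnitude of the microphone frame, keep its phase."""
    mic_spec = np.fft.rfft(mic)
    far_spec = np.fft.rfft(far)
    magnitude, phase = np.abs(mic_spec), np.angle(mic_spec)
    mask = placeholder_mask(np.log1p(magnitude), np.log1p(np.abs(far_spec)))
    return np.fft.irfft(mask * magnitude * np.exp(1j * phase), n=len(mic))


def learned_stage(frame, far, encoder, decoder):
    """Stage 2: mask a learned, real-valued projection of the time-domain frame."""
    features = encoder @ frame
    mask = placeholder_mask(np.abs(features), np.abs(encoder @ far))
    return decoder @ (mask * features)


def process_frame(mic, far, encoder, decoder):
    """Run both stages in sequence on one frame."""
    return learned_stage(stft_stage(mic, far), far, encoder, decoder)


# Random data and random (untrained) projections, purely to exercise the code.
rng = np.random.default_rng(0)
encoder = rng.standard_normal((FEAT_DIM, FRAME_LEN)) / np.sqrt(FRAME_LEN)
decoder = rng.standard_normal((FRAME_LEN, FEAT_DIM)) / np.sqrt(FEAT_DIM)
mic_frame = rng.standard_normal(FRAME_LEN)
far_frame = rng.standard_normal(FRAME_LEN)

stage1_out = stft_stage(mic_frame, far_frame)
enhanced = process_frame(mic_frame, far_frame, encoder, decoder)
```

The first stage only scales spectral magnitudes and reuses the input phase, whereas the second stage operates on a real-valued projection of the waveform itself, which is one way to read the abstract's remark that phase information is also taken into account.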

Publication:

arXiv e-prints

Pub Date:

October 2020

DOI:

10.48550/arXiv.2010.14337

arXiv:

arXiv:2010.14337

Bibcode:

2020arXiv201014337W

Keywords:

Electrical Engineering and Systems Science - Audio and Speech Processing

E-Print:

Submitted to ICASSP 2021
