Gesper: A Restoration-Enhancement Framework for General Speech Reconstruction

doi:10.48550/arXiv.2306.08454

This paper describes a real-time General Speech Reconstruction (Gesper) system submitted to the ICASSP 2023 Speech Signal Improvement (SSI) Challenge. The proposed system is a two-stage architecture in which speech restoration is performed first and speech enhancement is then applied in cascade. We propose, for the first time, a complex spectral mapping-based generative adversarial network (CSM-GAN) as the speech restoration module. For noise suppression and dereverberation, the enhancement module uses fullband-wideband parallel processing. On the blind test set of the ICASSP 2023 SSI Challenge, the proposed Gesper system, which satisfies the real-time condition, achieves an overall mean opinion score (MOS) of 3.27 under P.804 and 3.35 under P.835, ranking 1st in both track 1 and track 2.

Publication: arXiv e-prints

Pub Date: June 2023

DOI: 10.48550/arXiv.2306.08454

arXiv: arXiv:2306.08454

Bibcode: 2023arXiv230608454L

Keywords: Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing

E-Print: Accepted by InterSpeech 2023
