MC-SEMamba: A Simple Multi-channel Extension of SEMamba

doi:10.48550/arXiv.2409.17898

Transformer-based models have become increasingly popular and have impacted speech-processing research owing to their exceptional performance in sequence modeling. Recently, a promising model architecture, Mamba, has emerged as a potential alternative to transformer-based models because of its efficient modeling of long sequences. In particular, models like SEMamba have demonstrated the effectiveness of the Mamba architecture in single-channel speech enhancement. This paper aims to adapt SEMamba for multi-channel applications with only a small increase in parameters. The resulting system, MC-SEMamba, achieved results on the CHiME3 dataset that were comparable or even superior to several previous baseline models. Additionally, we found that increasing the number of microphones from 1 to 6 improved the speech enhancement performance of MC-SEMamba.

Publication:

arXiv e-prints

Pub Date:

September 2024

DOI:

10.48550/arXiv.2409.17898

arXiv:

arXiv:2409.17898

Bibcode:

2024arXiv240917898T

Keywords:

Electrical Engineering and Systems Science - Audio and Speech Processing;
Computer Science - Sound
