Frequency-domain neural beamformers are the mainstream approach in recent multi-channel speech separation models. Despite their well-defined behavior and effectiveness, such beamformers are limited by a bounded oracle performance and by the difficulty of designing proper networks for complex-valued operations. In this paper, we propose a time-domain generalized Wiener filter (TD-GWF), an extension of conventional frequency-domain beamformers that has higher oracle performance and involves only real-valued operations. We also discuss how TD-GWF connects to conventional frequency-domain beamformers. Experimental results show that a significant performance improvement can be achieved by replacing frequency-domain beamformers with the TD-GWF in recently proposed sequential neural beamforming pipelines.
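To illustrate what a real-valued, time-domain Wiener-style filter can look like, here is a minimal sketch. It is not the TD-GWF formulation from the paper: it assumes a plain regularized least-squares filter over stacked multi-channel frames, computed with oracle access to the target. The function names `frame_signal` and `time_domain_wiener`, the frame length, the hop size, and the regularizer `eps` are all hypothetical choices made for illustration.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames of shape (frame_len, n_frames)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[:, None] + hop * np.arange(n_frames)[None, :]
    return x[idx]


def time_domain_wiener(mix, target, frame_len=16, hop=8, eps=1e-6):
    """Oracle least-squares filter using only real-valued operations.

    mix:    (C, N) multi-channel mixture (assumed layout)
    target: (N,)   clean reference signal (oracle, assumed available)
    Returns W of shape (frame_len, C * frame_len) and the filtered frames.
    """
    # Stack the frames of all channels along the feature axis: (C * L, T).
    Y = np.concatenate([frame_signal(ch, frame_len, hop) for ch in mix], axis=0)
    X = frame_signal(target, frame_len, hop)  # (L, T)

    n_frames = Y.shape[1]
    R_yy = Y @ Y.T / n_frames  # mixture autocorrelation
    R_xy = X @ Y.T / n_frames  # target-mixture cross-correlation

    # Solve W (R_yy + eps I) = R_xy; R_yy is symmetric, so transpose and solve.
    reg = R_yy + eps * np.eye(R_yy.shape[0])
    W = np.linalg.solve(reg, R_xy.T).T
    return W, W @ Y
```

Given a `(C, N)` mixture array and the matching clean target, `time_domain_wiener(mix, target)` returns the filter matrix and the filtered frames. All quantities stay real-valued, which is the property the abstract highlights, but the frame-wise least-squares setup here is only one simple way to realize it.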
- Pub Date: December 2021
- Electrical Engineering and Systems Science - Audio and Speech Processing;
- Computer Science - Artificial Intelligence;
- Computer Science - Sound;
- Electrical Engineering and Systems Science - Signal Processing