Sound Event Localization based on Sound Intensity Vector Refined By DNN-Based Denoising and Source Separation
Abstract
We propose a direction-of-arrival (DOA) estimation method for Sound Event Localization and Detection (SELD). Direct estimation of DOA using a deep neural network (DNN), i.e., a completely data-driven approach, achieves high accuracy. However, there is a gap in accuracy between DOA estimation for single and overlapping sources because such approaches cannot incorporate physical knowledge. Meanwhile, although physics-based approaches are less accurate than DNN-based approaches, they are robust to overlapping sources. In this study, we consider a combination of physics-based and DNN-based approaches: the sound intensity vectors (IVs) used for physics-based DOA estimation are refined by DNN-based denoising and source separation. This method enables accurate DOA estimation for both single and overlapping sources using a spherical microphone array. Experimental results show that the proposed method achieves state-of-the-art DOA estimation accuracy on an open SELD dataset.
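As a rough illustration of the physics-based step mentioned in the abstract, the sketch below estimates a DOA from sound intensity vectors. It is not the authors' implementation, and it omits the DNN-based denoising and source separation entirely. It assumes, beyond what the abstract states, that the array signals are available as first-order Ambisonics STFT channels `W`, `X`, `Y`, `Z`, encoded so that `X`, `Y`, `Z` equal the source signal scaled by the direction cosines of the source direction. The function name `intensity_vector_doa` is hypothetical.

```python
import numpy as np

def intensity_vector_doa(W, X, Y, Z):
    """Estimate one DOA (azimuth, elevation in radians) from FOA STFT channels.

    Assumption: W, X, Y, Z are complex arrays of identical shape
    (frequency bins x time frames), with X, Y, Z encoded along the
    source direction, so the intensity vector points toward the source.
    """
    # Intensity vector per time-frequency bin, up to a positive scale factor.
    iv = np.stack([np.real(np.conj(W) * C) for C in (X, Y, Z)])
    # Sum the per-bin vectors over all time-frequency bins.
    ix, iy, iz = iv.reshape(3, -1).sum(axis=1)
    azimuth = np.arctan2(iy, ix)
    elevation = np.arctan2(iz, np.hypot(ix, iy))
    return azimuth, elevation
```

With a single noiseless plane wave, the summed vector is parallel to the source direction, so the true angles are recovered. With noise or overlapping sources, the per-bin vectors are contaminated, which is the situation the refinement described in the abstract targets.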
- Publication: arXiv e-prints
- Pub Date: February 2020
- DOI: 10.48550/arXiv.2002.05994
- arXiv: arXiv:2002.05994
- Bibcode: 2020arXiv200205994Y
- Keywords: Electrical Engineering and Systems Science - Audio and Speech Processing; Computer Science - Sound
- E-Print: 5 pages, 3 figures, to appear in IEEE ICASSP 2020