Channel-Spatial-Based Few-Shot Bird Sound Event Detection
Abstract
In this paper, we propose a model for bird sound event detection that focuses on a small number of training samples within the everyday long-tail distribution. As a result, we investigate bird sound detection using the few-shot learning paradigm. By integrating channel and spatial attention mechanisms, improved feature representations can be learned from few-shot training datasets. We develop a Metric Channel-Spatial Network model by incorporating a Channel Spatial Squeeze-Excitation block into the prototype network, combining it with these attention mechanisms. We evaluate the Metric Channel Spatial Network model on the DCASE 2022 Take5 dataset benchmark, achieving an F-measure of 66.84% and a PSDS of 58.98%. Our experiment demonstrates that the combination of channel and spatial attention mechanisms effectively enhances the performance of bird sound classification and detection.
- Publication:
-
arXiv e-prints
- Pub Date:
- June 2023
- DOI:
- 10.48550/arXiv.2306.10499
- arXiv:
- arXiv:2306.10499
- Bibcode:
- 2023arXiv230610499L
- Keywords:
-
- Electrical Engineering and Systems Science - Audio and Speech Processing;
- Computer Science - Sound
- E-Print:
- 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference