Multi-Branch Learning for Weakly-Labeled Sound Event Detection
Abstract
There are two sub-tasks implied in the weakly-supervised SED: audio tagging and event boundary detection. Current methods which combine multi-task learning with SED requires annotations both for these two sub-tasks. Since there are only annotations for audio tagging available in weakly-supervised SED, we design multiple branches with different learning purposes instead of pursuing multiple tasks. Similar to multiple tasks, multiple different learning purposes can also prevent the common feature which the multiple branches share from overfitting to any one of the learning purposes. We design these multiple different learning purposes based on combinations of different MIL strategies and different pooling methods. Experiments on the DCASE 2018 Task 4 dataset and the URBAN-SED dataset both show that our method achieves competitive performance.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2020
- DOI:
- 10.48550/arXiv.2002.09661
- arXiv:
- arXiv:2002.09661
- Bibcode:
- 2020arXiv200209661H
- Keywords:
-
- Electrical Engineering and Systems Science - Audio and Speech Processing
- E-Print:
- Accepted by ICASSP 2020