Capsule Routing for Sound Event Detection
Abstract
The detection of acoustic scenes is a challenging problem in which environmental sound events must be detected from a given audio signal. This includes classifying the events as well as estimating their onset and offset times. We approach this problem with a neural network architecture that uses the recently proposed capsule routing mechanism. A capsule is a group of activation units representing a set of properties for an entity of interest, and the purpose of routing is to identify part-whole relationships between capsules. That is, a capsule in one layer is assumed to belong to a capsule in the layer above in terms of the entity being represented. Using capsule routing, we wish to train a network that can learn global coherence implicitly, thereby improving generalization performance. Our proposed method is evaluated on Task 4 of the DCASE 2017 challenge. Results show that classification performance is state-of-the-art, achieving an F-score of 58.6%. In addition, overfitting is reduced considerably compared to other architectures.
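The part-whole assignment described in the abstract can be illustrated with a minimal sketch. It assumes the routing-by-agreement procedure commonly used with capsule networks; the abstract itself does not specify the routing details, so this is not necessarily the authors' exact method. The names `squash`, `route`, and `u_hat`, the array shapes, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Scale each vector so its length lies in [0, 1), keeping its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, n_iters=3):
    """Assumed routing-by-agreement sketch.

    u_hat has shape (n_lower, n_upper, dim): the prediction each lower-layer
    capsule makes for each upper-layer capsule.
    """
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))  # routing logits, start uniform
    for _ in range(n_iters):
        # Each lower capsule distributes its output over the upper capsules.
        c = softmax(b, axis=1)
        # Weighted sum of predictions per upper capsule, then squash.
        s = np.einsum("ij,ijd->jd", c, u_hat)
        v = squash(s)
        # Increase the logit where a prediction agrees with the output.
        b = b + np.einsum("ijd,jd->ij", u_hat, v)
    return v, c

# Illustrative input: 6 lower capsules, 3 upper capsules, 4-dimensional outputs.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(6, 3, 4))
v, c = route(u_hat)
```

Here each row of `c` sums to 1, so it can be read as a soft assignment of one lower-layer capsule to the capsules in the layer above, and the length of each row of `v` stays below 1.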
- Publication: arXiv e-prints
- Pub Date: June 2018
- DOI: 10.48550/arXiv.1806.04699
- arXiv: arXiv:1806.04699
- Bibcode: 2018arXiv180604699I
- Keywords: Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
- E-Print: Paper accepted for 26th European Signal Processing Conference (EUSIPCO 2018)