PERSA+: A Deep Learning Front-End for Context-Agnostic Audio Classification
Abstract
Deep learning has been applied to diverse audio semantics tasks, enabling the construction of models that learn hierarchical levels of features from high-dimensional raw data and deliver state-of-the-art performance. But do these algorithms perform similarly in real-world conditions, or only on benchmarks, where their high learning capacity allows them to memorize the datasets employed? This work presents a deep learning front-end that aims to discard detrimental information before it enters the modeling stage, keeping the learning process closer to the task at hand, with a view to developing robust and context-agnostic classification algorithms.
- Publication: arXiv e-prints
- Pub Date: July 2021
- DOI: 10.48550/arXiv.2107.09311
- arXiv: arXiv:2107.09311
- Bibcode: 2021arXiv210709311V
- Keywords: Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing