Neural Architecture Search on Acoustic Scene Classification

doi:10.48550/arXiv.1912.12825

Neural Architecture Search on Acoustic Scene Classification

Convolutional neural networks are widely adopted in Acoustic Scene Classification (ASC) tasks, but they generally carry a heavy computational burden. In this work, we propose a lightweight yet high-performing baseline network inspired by MobileNetV2, which replaces square convolutional kernels with unidirectional ones to extract features alternately in temporal and frequency dimensions. Furthermore, we explore a dynamic architecture space built on the basis of the proposed baseline with the recent Neural Architecture Search (NAS) paradigm, which first trains a supernet that incorporates all candidate networks and then applies a well-known evolutionary algorithm NSGA-II to discover more efficient networks with higher accuracy and lower computational cost. Experimental results demonstrate that our searched network is competent in ASC tasks, which achieves 90.3% F1-score on the DCASE2018 task 5 evaluation set, marking a new state-of-the-art performance while saving 25% of FLOPs compared to our baseline network.

Publication:

arXiv e-prints

Pub Date:

December 2019

DOI:

10.48550/arXiv.1912.12825

arXiv:

arXiv:1912.12825

Bibcode:

2019arXiv191212825L

Keywords:

Computer Science - Sound;
Computer Science - Machine Learning;
Electrical Engineering and Systems Science - Audio and Speech Processing;
Statistics - Machine Learning

E-Print:

Accepted to Interspeech 2020

NASA/ADS

Neural Architecture Search on Acoustic Scene Classification

Abstract