Joint Scattering for Automatic Chick Call Recognition
Abstract
Animal vocalisations contain important information about health, emotional state, and behaviour, and can therefore potentially be used for animal welfare monitoring. Motivated by the spectro-temporal patterns of chick calls in the time-frequency domain, in this paper we propose an automatic system for chick call recognition using the joint time-frequency scattering transform (JTFS). Taking full-length recordings as input, the system first extracts chick call candidates using an onset detector and silence removal. After computing their JTFS features, a support vector machine classifier assigns each candidate to one of the chick call types. Evaluated on a dataset comprising 3013 chick calls collected in laboratory conditions, the proposed recognition system using the JTFS features improves the frame- and event-based macro F-measures by 9.5% and 11.7%, respectively, over a mel-frequency cepstral coefficients (MFCC) baseline.
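As an illustration of the frame-based macro F-measure mentioned above, the following is a minimal sketch in plain Python. It assumes each frame carries exactly one label; the function name `macro_f_measure` and the labels `"a"` and `"b"` are hypothetical, and the abstract does not specify the paper's exact evaluation protocol. The event-based variant would additionally need a rule for matching detected events to reference events, which is not given here.

```python
from collections import Counter


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over paired frame labels.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # frame correctly assigned to class t
        else:
            fp[p] += 1      # predicted class p gains a false positive
            fn[t] += 1      # true class t gains a false negative
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with two hypothetical call types, "a" and "b"
reference = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
score = macro_f_measure(reference, predicted)
```

Because the macro average weights every class equally, a rare call type influences the score as much as a frequent one, which matters when call types are unevenly represented.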
- Publication: arXiv e-prints
- Pub Date: October 2021
- DOI: 10.48550/arXiv.2110.03965
- arXiv: arXiv:2110.03965
- Bibcode: 2021arXiv211003965W
- Keywords: Electrical Engineering and Systems Science - Audio and Speech Processing; Computer Science - Sound
- E-Print: 5 pages, submitted to ICASSP 2022