On-Device Domain Learning for Keyword Spotting on Low-Power Extreme Edge Embedded Systems
Abstract
Keyword spotting accuracy degrades when neural networks are exposed to noisy environments. On-site adaptation to previously unseen noise is crucial to recovering accuracy loss, and on-device learning is required to ensure that the adaptation process happens entirely on the edge device. In this work, we propose a fully on-device domain adaptation system achieving up to 14% accuracy gains over already-robust keyword spotting models. We enable on-device learning with less than 10 kB of memory, using only 100 labeled utterances to recover 5% accuracy after adapting to the complex speech noise. We demonstrate that domain adaptation can be achieved on ultra-low-power microcontrollers with as little as 806 mJ in only 14 s on always-on, battery-operated devices.
- Publication:
-
arXiv e-prints
- Pub Date:
- March 2024
- DOI:
- 10.48550/arXiv.2403.10549
- arXiv:
- arXiv:2403.10549
- Bibcode:
- 2024arXiv240310549C
- Keywords:
-
- Computer Science - Sound;
- Computer Science - Machine Learning;
- Electrical Engineering and Systems Science - Audio and Speech Processing
- E-Print:
- 5 pages, 2 tables, 2 figures. Accepted at IEEE AICAS 2024