Meta-Learning Approaches for Improving Detection of Unseen Speech Deepfakes

doi:10.48550/arXiv.2410.20578

Current speech deepfake detection approaches perform satisfactorily against known adversaries; however, generalization to unseen attacks remains an open challenge. The proliferation of speech deepfakes on social media underscores the need for systems that can generalize to attacks not observed during training. We address this problem from the perspective of meta-learning, aiming to learn attack-invariant features so that the system can adapt to unseen attacks with very few samples available. This approach is promising because generating a large-scale training dataset is often expensive or infeasible. Our experiments demonstrated an improvement in the Equal Error Rate (EER) from 21.67% to 10.42% on the InTheWild dataset, using just 96 samples from the unseen dataset. Continuous few-shot adaptation ensures that the system remains up to date.
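The abstract reports results as Equal Error Rate (EER), the operating point at which the false acceptance rate (spoofed speech accepted as genuine) equals the false rejection rate (genuine speech rejected). The following is a minimal sketch of how EER can be estimated from detector scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the score convention (higher means more likely bona fide), and the toy scores are assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Estimate the EER and the threshold at which it occurs.

    Assumption: a higher score means the detector considers the
    utterance more likely to be bona fide (genuine) speech.
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct score that was observed.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide utterances scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed utterances scored at or above it.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER lies where the two error rates cross; with finitely many
    # scores they may not meet exactly, so average them at the closest point.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0), float(thresholds[idx])


# Toy scores, invented for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(bonafide, spoof)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

On this scale, the reported change from 21.67% to 10.42% is a relative EER reduction of roughly 52%.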

Publication:

arXiv e-prints

Pub Date:

October 2024

DOI:

10.48550/arXiv.2410.20578

arXiv:

arXiv:2410.20578

Bibcode:

2024arXiv241020578K

Keywords:

Electrical Engineering and Systems Science - Audio and Speech Processing;
Computer Science - Artificial Intelligence;
Computer Science - Sound

E-Print:

6 pages, accepted to the IEEE Spoken Language Technology Workshop (SLT) 2024
