Estimating the confidence of speech spoofing countermeasure
Abstract
Conventional speech spoofing countermeasures (CMs) are designed to make a binary decision on an input trial. However, a CM trained on a closed-set database is theoretically not guaranteed to perform well on unknown spoofing attacks. In some scenarios, an alternative strategy is to let the CM defer a decision when it is not confident. The question is then how to estimate a CM's confidence regarding an input trial. We investigated a few confidence estimators that can be easily plugged into a CM. On the ASVspoof2019 logical access database, the results demonstrate that an energy-based estimator and a neural-network-based one achieved acceptable performance in identifying unknown attacks in the test set. On a test set with additional unknown attacks and bona fide trials from other databases, the confidence estimators performed moderately well, and the CMs better discriminated bona fide and spoofed trials that had a high confidence score. Additional results also revealed the difficulty in enhancing a confidence estimator by adding unknown attacks to the training set.
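The abstract does not spell out how the energy-based estimator is computed, so the following is only a minimal sketch of the general idea of deferring a decision when confidence is low. It assumes the commonly used energy score (the log-sum-exp of a classifier's output logits, with higher values read as higher confidence) and a two-class CM. The function names, the label order, the temperature parameter, and the threshold value are hypothetical and are not taken from the paper or its code repository.

```python
import math


def energy_confidence(logits, temperature=1.0):
    """Negative free energy T * logsumexp(logits / T).

    Assumption: a higher value is interpreted as higher confidence.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    return temperature * (peak + math.log(sum(math.exp(s - peak) for s in scaled)))


def decide_or_defer(logits, threshold, labels=("spoof", "bonafide")):
    """Return (decision, confidence); the decision is 'defer' below the threshold.

    The label order and the threshold are illustrative assumptions.
    """
    confidence = energy_confidence(logits)
    if confidence < threshold:
        return "defer", confidence
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best], confidence


if __name__ == "__main__":
    # Hypothetical logits: one ambiguous trial and one clear-cut trial.
    print(decide_or_defer([0.0, 0.0], threshold=1.0))
    print(decide_or_defer([5.0, -5.0], threshold=1.0))
```

In practice the threshold would have to be tuned on held-out data, since it sets the trade-off between how many trials are deferred and how reliable the remaining decisions are.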
- Publication:
- arXiv e-prints
- Pub Date:
- October 2021
- DOI:
- 10.48550/arXiv.2110.04775
- arXiv:
- arXiv:2110.04775
- Bibcode:
- 2021arXiv211004775W
- Keywords:
- Electrical Engineering and Systems Science - Audio and Speech Processing;
- Computer Science - Cryptography and Security;
- Computer Science - Sound
- E-Print:
- Work in progress. Comments are welcome. Accepted by ICASSP 2022. Code is available at https://github.com/nii-yamagishilab/project-NN-Pytorch-scripts. Not all comments from the anonymous reviewers could be addressed within 4 pages; we apologize for that.