Statistical Models in Forensic Voice Comparison
Abstract
This chapter describes a number of signal-processing and statistical-modeling techniques that are commonly used to calculate likelihood ratios in human-supervised automatic approaches to forensic voice comparison. Techniques described include mel-frequency cepstral coefficient (MFCC) feature extraction, Gaussian mixture model - universal background model (GMM-UBM) systems, i-vector - probabilistic linear discriminant analysis (i-vector PLDA) systems, deep neural network (DNN) based systems (including senone posterior i-vectors, bottleneck features, and embeddings / x-vectors), mismatch compensation, and score-to-likelihood-ratio conversion (aka calibration). Empirical validation of forensic-voice-comparison systems is also covered. The aim of the chapter is to bridge the gap between general introductions to forensic voice comparison and the highly technical automatic-speaker-recognition literature from which the signal-processing and statistical-modeling techniques are mostly drawn. Knowledge of the likelihood-ratio framework for the evaluation of forensic evidence is assumed. It is hoped that the material presented here will be of value to students of forensic voice comparison and to researchers interested in learning about statistical modeling techniques that could potentially also be applied to data from other branches of forensic science.
 Publication:

arXiv e-prints
 Pub Date:
 December 2019
 arXiv:
 arXiv:1912.13242
 Bibcode:
 2019arXiv191213242M
 Keywords:

 Statistics - Applications;
 Computer Science - Sound;
 Electrical Engineering and Systems Science - Audio and Speech Processing
 E-Print:
 Morrison, G.S., Enzinger, E., Ramos, D., González-Rodríguez, J., Lozano-Díez, A. (2020). Statistical models in forensic voice comparison. In Banks, D.L., Kafadar, K., Kaye, D.H., Tackett, M. (Eds.) Handbook of Forensic Statistics (Ch. 20, pp. 451–497). Boca Raton, FL: CRC