Statistical Models In Forensic Voice Comparison
2019 · Geoffrey Stewart Morrison, Ewald Enzinger, Daniel Ramos, et al.
Abstract
This chapter describes signal-processing and statistical-modeling techniques that are commonly used to calculate likelihood ratios in human-supervised automatic approaches to forensic voice comparison. The techniques described include mel-frequency cepstral coefficient (MFCC) feature extraction, Gaussian mixture model - universal background model (GMM-UBM) systems, i-vector - probabilistic linear discriminant analysis (i-vector PLDA) systems, deep neural network (DNN) based systems (including senone posterior i-vectors, bottleneck features, and embeddings / x-vectors), mismatch compensation, and score-to-likelihood-ratio conversion (also known as calibration). Empirical validation of forensic-voice-comparison systems is also covered. The aim of the chapter is to bridge the gap between general introductions to forensic voice comparison and the highly technical automatic-speaker-recognition literature from which most of these signal-processing and statistical-modeling techniques are drawn.
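To make the score-to-likelihood-ratio conversion step concrete, here is a minimal sketch of one common approach: fitting a logistic-regression line that maps raw comparison scores to natural-log likelihood ratios, then summarizing performance with the log-likelihood-ratio cost (Cllr). This is an illustration under stated assumptions, not the chapter's own implementation: the function names, the toy scores, and the use of plain gradient descent are all hypothetical choices made for brevity.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def fit_calibration(same_scores, diff_scores, lr=0.1, iters=5000):
    """Fit log LR = a * score + b by class-balanced logistic regression.

    Weighting the two classes equally corresponds to prior odds of 1,
    so the fitted log-odds can be read as a log likelihood ratio.
    Plain gradient descent is an assumption made to keep the sketch short.
    """
    a, b = 1.0, 0.0
    n_s, n_d = len(same_scores), len(diff_scores)
    for _ in range(iters):
        grad_a = grad_b = 0.0
        for s in same_scores:            # same-speaker pairs: target 1
            err = sigmoid(a * s + b) - 1.0
            grad_a += 0.5 * err * s / n_s
            grad_b += 0.5 * err / n_s
        for s in diff_scores:            # different-speaker pairs: target 0
            err = sigmoid(a * s + b)
            grad_a += 0.5 * err * s / n_d
            grad_b += 0.5 * err / n_d
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

def cllr(same_llrs, diff_llrs):
    """Log-likelihood-ratio cost in bits, from natural-log LRs."""
    same_cost = sum(softplus(-l) for l in same_llrs) / len(same_llrs)
    diff_cost = sum(softplus(l) for l in diff_llrs) / len(diff_llrs)
    return 0.5 * (same_cost + diff_cost) / math.log(2)

# Hypothetical toy scores (not data from the chapter).
same_scores = [2.1, 1.5, 3.0, 0.4, 2.6]
diff_scores = [-1.2, 0.3, -0.5, 0.9, -2.0]

a, b = fit_calibration(same_scores, diff_scores)
same_llrs = [a * s + b for s in same_scores]
diff_llrs = [a * s + b for s in diff_scores]
print("slope:", a, "offset:", b, "Cllr:", cllr(same_llrs, diff_llrs))
```

A system that always outputs a likelihood ratio of 1 has a Cllr of exactly 1, so values below 1 indicate that the calibrated output carries useful information. In practice the calibration parameters would be fitted on data separate from the data used for validation; the sketch reuses the same toy scores only for brevity.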
Related papers
- Bayesian Strategies For Likelihood Ratio Computation In Forensic Voice Comparison With Automatic Systems (2019)
- A Text-independent Speaker Verification Model: A Comparative Analysis (2017)
- Validation Of An ECAPA-TDNN System For Forensic Automatic Speaker Recognition Under Case Work Conditions (2023)
- Nebula: F0 Estimation And Voicing Detection By Modeling The Statistical Properties Of Feature Extractors (2017)
- Application Of ASV For Voice Identification After VC And Duration Predictor Improvement In TTS Models (2024)
- Comparison Of Multiple Features And Modeling Methods For Text-dependent Speaker Verification (2017)
- The Sound Of Silence: Efficiency Of First Digit Features In Synthetic Audio Detection (2022)
- Distinguishing Neural Speech Synthesis Models Through Fingerprints In Speech Waveforms (2023)