Validation Of An ECAPA-TDNN System For Forensic Automatic Speaker Recognition Under Case Work Conditions
2023 · Francesco Sigona, Mirko Grimaldi
Abstract
Different variants of a Forensic Automatic Speaker Recognition (FASR) system based on Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network (ECAPA-TDNN) are tested under conditions reflecting those of a real forensic voice comparison case, following the forensic_eval_01 evaluation campaign settings. With this recent neural model serving as the embedding extraction block, we apply various normalization strategies at the embedding and score levels and observe how system performance varies in terms of discriminating power, accuracy, and precision metrics. The results indicate that ECAPA-TDNN can be used very successfully as a base component of a FASR system, surpassing the previous state of the art, at least under the operating conditions considered.
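To make "normalization at the level of embeddings and scores" concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the authors' pipeline: the abstract does not say which strategies were used, so length normalization, cosine scoring, and symmetric score normalization (S-norm) are stand-ins chosen because they are common in speaker verification. The function names and the toy cohort are hypothetical.

```python
import math
from statistics import mean, pstdev


def length_normalize(vec):
    """Embedding-level normalization: scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_score(a, b):
    """Cosine similarity between two embeddings (assumed scoring back end)."""
    a_n, b_n = length_normalize(a), length_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def s_norm(raw, enroll, test, cohort):
    """Score-level normalization: symmetric normalization against a cohort.

    The raw score is standardized once with the enrollment-vs-cohort score
    statistics and once with the test-vs-cohort statistics; the two
    standardized scores are then averaged.
    """
    e_scores = [cosine_score(enroll, c) for c in cohort]
    t_scores = [cosine_score(test, c) for c in cohort]
    e_std, t_std = pstdev(e_scores), pstdev(t_scores)
    if e_std == 0.0 or t_std == 0.0:
        raise ValueError("cohort scores have zero variance")
    z_e = (raw - mean(e_scores)) / e_std
    z_t = (raw - mean(t_scores)) / t_std
    return 0.5 * (z_e + z_t)


# Toy 2-dimensional "embeddings"; real ECAPA-TDNN embeddings are much larger.
enroll = [1.0, 0.0]
test = [1.0, 0.0]
cohort = [[1.0, 0.0], [0.0, 1.0]]

raw = cosine_score(enroll, test)
normalized = s_norm(raw, enroll, test, cohort)
```

In a forensic setting the normalized score would typically still need calibration into a likelihood ratio; that step is outside this sketch.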
Related papers
- ECAPA-TDNN: Emphasized Channel Attention, Propagation And Aggregation In TDNN Based Speaker Verification (2020)
- Cross-domain Adaptation With Discrepancy Minimization For Text-independent Forensic Speaker Verification (2020)
- The HCCL Speaker Verification System For Far-field Speaker Verification Challenge (2021)
- CAM++: A Fast And Efficient Network For Speaker Verification Using Context-aware Masking (2023)
- MFA: TDNN With Multi-scale Frequency-channel Attention For Text-independent Speaker Verification With Short Utterances (2022)
- ECAPA2: A Hybrid Neural Network Architecture And Training Strategy For Robust Speaker Embeddings (2024)
- Parameterized Channel Normalization For Far-field Deep Speaker Verification (2021)
- Vocal Style Factorization For Effective Speaker Recognition In Affective Scenarios (2023)