Cross-domain Adaptation With Discrepancy Minimization For Text-independent Forensic Speaker Verification
2020 · Zhenyu Wang, Wei Xia, John H. L. Hansen
Abstract
Forensic audio analysis for speaker verification offers unique challenges due to location/scenario uncertainty and diversity mismatch between reference and naturalistic field recordings. The lack of real naturalistic forensic audio corpora with ground-truth speaker identity represents a major challenge in this field. It is also difficult to directly employ small-scale domain-specific data to train complex neural network architectures due to domain mismatch and loss in performance. Alternatively, cross-domain speaker verification for multiple acoustic environments is a challenging task which could advance research in audio forensics. In this study, we introduce a CRSS-Forensics audio dataset collected in multiple acoustic environments. We pre-train a CNN-based network using the VoxCeleb data, followed by an approach which fine-tunes part of the high-level network layers with clean speech from CRSS-Forensics. Based on this fine-tuned model, we align domain-specific distributions in the e
Related papers
- Multi-domain Adaptation By Self-supervised Learning For Speaker Verification (2023)
- Open-set Short Utterance Forensic Speaker Verification Using Teacher-student Network With Explicit Inductive Bias (2020)
- Self-supervised Learning Based Domain Adaptation For Robust Speaker Verification (2021)
- Validation Of An ECAPA-TDNN System For Forensic Automatic Speaker Recognition Under Case Work Conditions (2023)
- Source-free Domain Adaptation For Speaker Verification In Data-scarce Languages And Noisy Channels (2024)
- Adapting End-to-end Neural Speaker Verification To New Languages And Recording Conditions With Adversarial Training (2018)
- Adversarial Training For Multi-domain Speaker Recognition (2020)
- Speaker Verification Using End-to-end Adversarial Language Adaptation (2018)