Adapting End-to-end Neural Speaker Verification To New Languages And Recording Conditions With Adversarial Training
2018 · Gautam Bhattacharya, Jahangir Alam, Patrick Kenny
Abstract
In this article we propose a novel approach for adapting speaker embeddings to new domains based on adversarial training of neural networks. We apply our embeddings to the task of text-independent speaker verification, a challenging, real-world problem in biometric security. We further the development of end-to-end speaker embedding models by combining a novel 1-dimensional, self-attentive residual network, an angular margin loss function, and an adversarial training strategy. Our model is able to learn extremely compact, 64-dimensional speaker embeddings that deliver competitive performance on a number of popular datasets using simple cosine distance scoring. On the NIST-SRE 2016 task we are able to beat a strong i-vector baseline, while on the Speakers in the Wild task our model was able to outperform both i-vector and x-vector baselines, showing an absolute improvement of 2.19% over the latter. Additionally, we show that the integration of adversarial training consistently leads to a sig
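The cosine distance scoring mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `cosine_score` and `verify` and the decision threshold of 0.5 are hypothetical, chosen only for illustration, and the embeddings are assumed to be plain lists of floats (64 values each in the paper's setting).

```python
import math


def cosine_score(enroll, test):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(enroll) != len(test):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(enroll, test))
    norm_enroll = math.sqrt(sum(a * a for a in enroll))
    norm_test = math.sqrt(sum(b * b for b in test))
    if norm_enroll == 0.0 or norm_test == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_enroll * norm_test)


def verify(enroll, test, threshold=0.5):
    """Accept the trial if the score reaches the threshold.

    The threshold value is an assumption for illustration; in practice
    it would be tuned on held-out trials.
    """
    return cosine_score(enroll, test) >= threshold
```

A higher score means the two embeddings point in more similar directions, so a trial is accepted when the score is at or above the chosen threshold. No trained back-end is needed for this kind of scoring, which is presumably why the abstract calls it simple.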
Related papers
- Speaker Verification Using End-to-end Adversarial Language Adaptation (2018)
- Generative Adversarial Speaker Embedding Networks For Domain Robust End-to-end Speaker Verification (2018)
- Vae-based Domain Adaptation For Speaker Verification (2019)
- DEAAN: Disentangled Embedding And Adversarial Adaptation Network For Robust Speaker Representation Learning (2020)
- An End-to-end Text-independent Speaker Verification Framework With A Keyword Adversarial Network (2019)
- Investigation Of Speaker-adaptation Methods In Transformer Based ASR (2020)
- Zero-shot Multi-speaker Text-to-speech With State-of-the-art Neural Speaker Embeddings (2019)
- Robust Speaker Recognition Using Unsupervised Adversarial Invariance (2019)