Domain-invariant Speaker Vector Projection By Model-agnostic Meta-learning
2020 · Jiawen Kang, Ruiqi Liu, Lantian Li, et al.
Abstract
Domain generalization remains a critical problem for speaker recognition, even with state-of-the-art architectures based on deep neural nets. For example, a model trained on reading speech may largely fail when applied to singing or movie speech. In this paper, we propose a domain-invariant projection to improve the generalizability of speaker vectors. This projection is a simple neural net trained following the Model-Agnostic Meta-Learning (MAML) principle: the objective is for the projection to classify speakers correctly in one domain after it has been updated with speech data from another domain. We tested the proposed method on CNCeleb, a new dataset consisting of single-speaker multi-condition (SSMC) data. The results demonstrated that the MAML-based domain-invariant projection can produce more generalizable speaker vectors and effectively improve performance in unseen domains.
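The training principle described in the abstract (update on one domain, then score speaker classification on another) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a first-order approximation of MAML, a single linear softmax layer standing in for the projection net, and synthetic "speaker vectors" in which each domain adds its own offset to shared speaker centers. All names (`fomaml_step`, `make_domain`, `alpha`, `beta`) are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grad(W, X, y):
    """Mean cross-entropy of a linear speaker classifier and its gradient in W."""
    n = X.shape[0]
    p = softmax(X @ W)
    loss = -np.log(p[np.arange(n), y] + 1e-12).mean()
    p[np.arange(n), y] -= 1.0              # softmax output minus one-hot targets
    return loss, X.T @ p / n

def fomaml_step(W, domain_a, domain_b, alpha=0.05, beta=0.05):
    """One first-order MAML step: adapt on domain A, then score on domain B."""
    (Xa, ya), (Xb, yb) = domain_a, domain_b
    _, grad_a = loss_and_grad(W, Xa, ya)
    W_adapted = W - alpha * grad_a         # inner update on the meta-train domain
    loss_b, grad_b = loss_and_grad(W_adapted, Xb, yb)  # meta-test loss
    # First-order assumption: the dependence of W_adapted on W is ignored,
    # so the meta-test gradient is applied to W directly.
    return W - beta * grad_b, loss_b

def make_domain(rng, centers, n_per_spk=20):
    """Synthetic domain: the same speakers, shifted by a domain-specific offset."""
    n_spk, dim = centers.shape
    shift = rng.normal(size=dim)
    y = np.repeat(np.arange(n_spk), n_per_spk)
    X = centers[y] + shift + 0.3 * rng.normal(size=(y.size, dim))
    return X, y

rng = np.random.default_rng(0)
centers = rng.normal(size=(4, 8))          # 4 toy speakers, 8-dim vectors
domains = [make_domain(rng, centers) for _ in range(3)]

W = np.zeros((8, 4))
losses = []
for step in range(200):
    # Sample two distinct domains: one for the inner update, one for the meta-test loss.
    a, b = rng.choice(len(domains), size=2, replace=False)
    W, loss_b = fomaml_step(W, domains[a], domains[b])
    losses.append(loss_b)

print(f"meta-test loss: first step {losses[0]:.3f}, last step {losses[-1]:.3f}")
```

Full MAML would also differentiate through the inner update, which needs second-order gradients; the first-order variant is used here only to keep the sketch short and dependency-free beyond NumPy.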
Related papers
- Multi-domain Adaptation By Self-supervised Learning For Speaker Verification (2023)
- Vae-based Domain Adaptation For Speaker Verification (2019)
- Adversarial Training For Multi-domain Speaker Recognition (2020)
- Generative Adversarial Speaker Embedding Networks For Domain Robust End-to-end Speaker Verification (2018)
- DEAAN: Disentangled Embedding And Adversarial Adaptation Network For Robust Speaker Representation Learning (2020)
- Meta-learning With Latent Space Clustering In Generative Adversarial Network For Speaker Diarization (2020)
- Improved Meta-learning Training For Speaker Verification (2021)
- Locale-agnostic Universal Domain Classification Model In Spoken Language Understanding (2019)