Embeddings For DNN Speaker Adaptive Training
2019 · Joanna Rownicka, Peter Bell, Steve Renals
Abstract
In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs (DNN-SAT), focusing on a small amount of adaptation data per speaker. DNN-SAT can be viewed as learning a mapping from each embedding to transformation parameters that are applied to the shared parameters of the DNN. We investigate different approaches to applying these transformations, and find that with a good training strategy, a multi-layer adaptation network applied to all hidden layers is no more effective than a single linear layer acting on the embeddings to transform the input features. In the second part of our work, we evaluate different embeddings (i-vectors, x-vectors and deep CNN embeddings) in an additional speaker recognition task in order to gain insight into what should characterize an embedding for DNN-SAT. We find that the speaker recognition performance of a given representation is not correlated with its ASR performance; in fact, the ability to capture more speech attributes than ju
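The abstract's simplest adaptation scheme, a single linear layer acting on the speaker embedding to transform the input features, can be illustrated with a minimal sketch. The abstract does not say how the layer's output is combined with the features, so the additive shift used here (x' = x + W e + b) is an assumption, as are the function names, the dimensions, and the random weights, which stand in for parameters that would be trained jointly with the acoustic model.

```python
import random


def make_adaptation_layer(embed_dim, feat_dim, seed=0):
    """Return weights W (feat_dim x embed_dim) and bias b for the linear layer.

    Hypothetical helper: random values are used only to keep the sketch
    runnable; in practice these parameters would be learned.
    """
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)] for _ in range(feat_dim)]
    b = [0.0] * feat_dim
    return W, b


def speaker_shift(W, b, embedding):
    """Map one speaker embedding e to a per-dimension feature shift W e + b."""
    return [sum(w * e for w, e in zip(row, embedding)) + bias
            for row, bias in zip(W, b)]


def adapt_features(frames, shift):
    """Add the same speaker-dependent shift to every frame (assumed form)."""
    return [[x + s for x, s in zip(frame, shift)] for frame in frames]


# Toy example: 4-dimensional embedding, 3-dimensional features, 2 frames.
W, b = make_adaptation_layer(embed_dim=4, feat_dim=3)
embedding = [0.5, -1.0, 0.25, 0.0]
frames = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
adapted = adapt_features(frames, speaker_shift(W, b, embedding))
```

Because the shift depends only on the embedding, it is computed once per speaker and reused for every frame from that speaker; the rest of the network stays shared across speakers.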
Related papers
- Embedding-based Speaker Adaptive Training Of Deep Neural Networks (2017) 9.76
- Adapting End-to-end Neural Speaker Verification To New Languages And Recording Conditions With Adversarial Training (2018) 9.59
- Analyzing Deep CNN-based Utterance Embeddings For Acoustic Model Adaptation (2018) 6.77
- An Improved Deep Neural Network For Modeling Speaker Characteristics At Different Temporal Scales (2020) 6.34
- VAE-based Domain Adaptation For Speaker Verification (2019) 7.50
- On Deep Speaker Embeddings For Text-independent Speaker Recognition (2018) 11.93
- Investigation Of Speaker-adaptation Methods In Transformer Based ASR (2020) 0.00
- Deep Speaker Embedding Learning With Multi-level Pooling For Text-independent Speaker Verification (2019) 0.00