Data Generation Using Pass-phrase-dependent Deep Auto-encoders For Text-dependent Speaker Verification
2021 · Achintya Kumar Sarkar, Md Sahidullah, Zheng-Hua Tan
Abstract
In this paper, we propose a novel method that trains pass-phrase-specific deep neural network (PP-DNN) based auto-encoders to create augmented data for text-dependent speaker verification (TD-SV). Each PP-DNN auto-encoder is trained on the utterances of one particular pass-phrase available in the target enrollment set, using two methods: (i) transfer learning and (ii) training from scratch. Next, the feature vectors of a given utterance are fed to the PP-DNNs, and the frame-level output of each PP-DNN is treated as one new set of generated data. The generated data from each PP-DNN is then used to build a TD-SV system, in contrast to the conventional method that considers only the evaluation data available. The proposed approach can be viewed as transforming the data into the pass-phrase-specific space through a non-linear transformation learned by each PP-DNN. The method develops several TD-SV systems with the number equal to the number of PP-DNNs separately trained for each
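The data-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes NumPy, a single tanh hidden layer, plain gradient descent on squared reconstruction error, training from scratch only (no transfer learning), and synthetic random vectors in place of real acoustic features. The pass-phrase labels `pp01` and `pp02` and all function names are hypothetical.

```python
import numpy as np


def train_autoencoder(frames, hidden=16, epochs=200, lr=0.05, seed=0):
    """Fit a one-hidden-layer auto-encoder to frames (n_frames x dim)."""
    rng = np.random.default_rng(seed)
    n, dim = frames.shape
    w1 = rng.normal(0.0, 0.1, (dim, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, dim))
    b2 = np.zeros(dim)
    for _ in range(epochs):
        h = np.tanh(frames @ w1 + b1)          # hidden activations
        out = h @ w2 + b2                      # reconstruction
        err = (out - frames) / n               # gradient of the loss w.r.t. out
        dh = (err @ w2.T) * (1.0 - h ** 2)     # back-propagate through tanh
        w2 -= lr * (h.T @ err)
        b2 -= lr * err.sum(axis=0)
        w1 -= lr * (frames.T @ dh)
        b1 -= lr * dh.sum(axis=0)
    return w1, b1, w2, b2


def transform(model, frames):
    """Map frames through one auto-encoder, frame by frame."""
    w1, b1, w2, b2 = model
    return np.tanh(frames @ w1 + b1) @ w2 + b2


def generate_views(pp_models, utterance):
    """Return one generated feature set per pass-phrase auto-encoder."""
    return {pp: transform(m, utterance) for pp, m in pp_models.items()}


# Synthetic stand-in data: 50 frames of 8-dimensional features per pass-phrase.
rng = np.random.default_rng(1)
enrollment = {
    "pp01": rng.normal(0.0, 1.0, (50, 8)),  # hypothetical pass-phrase label
    "pp02": rng.normal(0.5, 1.0, (50, 8)),  # hypothetical pass-phrase label
}
pp_models = {pp: train_autoencoder(x) for pp, x in enrollment.items()}

utterance = rng.normal(0.0, 1.0, (30, 8))   # features of one given utterance
generated = generate_views(pp_models, utterance)
```

Each entry of `generated` has the same shape as the input utterance, so each could feed a separate downstream TD-SV system, which mirrors the one-system-per-PP-DNN idea in the abstract.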
Related papers
- Data Augmentation Enhanced Speaker Enrollment For Text-dependent Speaker Verification (2020) 0.00
- Exploring The Use Of An Unsupervised Autoregressive Model As A Shared Encoder For Text-dependent Speaker Verification (2020) 5.84
- Exploring Voice Conversion Based Data Augmentation In Text-dependent Speaker Verification (2020) 0.00
- Speaker Verification-derived Loss And Data Augmentation For DNN-based Multispeaker Speech Synthesis (2021) 3.58
- Joint Speaker Encoder And Neural Back-end Model For Fully End-to-end Automatic Speaker Verification With Multiple Enrollment Utterances (2022) 0.00
- Text-independent Speaker Verification Based On Deep Neural Networks And Segmental Dynamic Time Warping (2018) 3.58
- PAS: Partial Additive Speech Data Augmentation Method For Noise Robust Speaker Verification (2023) 0.00
- ECAPA-TDNN: Emphasized Channel Attention, Propagation And Aggregation In TDNN Based Speaker Verification (2020) 23.07