Vocal Tract Length Perturbation For Text-dependent Speaker Verification With Autoregressive Prediction Coding
2020 · Achintya Kr. Sarkar, Zheng-Hua Tan
Abstract
In this letter, we propose a vocal tract length (VTL) perturbation method for text-dependent speaker verification (TD-SV), in which a set of TD-SV systems is trained, one for each VTL factor, and score-level fusion is applied to make the final decision. Next, we explore the bottleneck (BN) feature extracted by training deep neural networks with a self-supervised objective, autoregressive predictive coding (APC), for TD-SV and compare it with the well-studied speaker-discriminant BN feature. The proposed VTL method is then applied to the APC and speaker-discriminant BN features. Finally, we combine the VTL perturbation systems trained on MFCC and the two BN features in the score domain. Experiments are performed on the RedDots challenge 2016 database of TD-SV using short utterances with Gaussian mixture model-universal background model and i-vector techniques. Results show that the proposed methods significantly outperform the baselines.
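The score-level fusion step described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the abstract does not specify the fusion rule, so equal-weight averaging is an assumption, and the function name `fuse_scores`, the VTL factors, the scores, and the decision threshold are all hypothetical.

```python
def fuse_scores(scores_per_system, weights=None):
    """Fuse per-trial scores from several TD-SV systems.

    scores_per_system: one list of trial scores per system, where each
    system is assumed to have been trained with a different VTL factor.
    weights: optional per-system weights; equal weights are assumed here.
    """
    n_systems = len(scores_per_system)
    if weights is None:
        weights = [1.0 / n_systems] * n_systems
    n_trials = len(scores_per_system[0])
    # Weighted sum over systems, computed independently for each trial.
    return [
        sum(w * system[t] for w, system in zip(weights, scores_per_system))
        for t in range(n_trials)
    ]


# Hypothetical VTL factors and made-up scores for three trials.
vtl_factors = [0.9, 1.0, 1.1]
scores = [
    [1.0, -0.5, 0.25],  # system trained with factor 0.9
    [2.0, -1.5, 0.75],  # system trained with factor 1.0
    [3.0, -1.0, 0.50],  # system trained with factor 1.1
]

fused = fuse_scores(scores)
# Accept a trial when its fused score exceeds an assumed threshold of 0.
decisions = [score > 0.0 for score in fused]
```

The same function could combine the MFCC, APC BN, and speaker-discriminant BN systems mentioned in the abstract, since the fusion operates only on scores and not on the features that produced them.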
Related papers
- The SVASR System For Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024 (2024) 0.00
- Time-contrastive Learning Based DNN Bottleneck Features For Text-dependent Speaker Verification (2017) 9.92
- On Bottleneck Features For Text-dependent Speaker Verification Using X-vectors (2020) 0.00
- Time-contrastive Learning Based Deep Bottleneck Features For Text-dependent Speaker Verification (2019) 9.92
- Exploring The Use Of An Unsupervised Autoregressive Model As A Shared Encoder For Text-dependent Speaker Verification (2020) 5.84
- A Bayesian Approach To Estimation Of Speaker Normalization Parameters (2016) 0.00
- Noise-robust Target-speaker Voice Activity Detection Through Self-supervised Pretraining (2025) 0.00
- P-vectors: A Parallel-coupled TDNN/Transformer Network For Speaker Verification (2023) 5.84