Feature Enhancement With Deep Feature Losses For Speaker Verification
2019 · Saurabh Kataria, Phani Sankar Nidadavolu, Jesús Villalba, et al.
Abstract
Speaker verification still struggles to generalize to novel adverse environments. We build on recent advances in deep-learning-based speech enhancement and propose a feature-domain, supervised denoising solution. We propose to use a Deep Feature Loss, which optimizes the enhancement network in the hidden activation space of a pre-trained auxiliary speaker embedding network. We experimentally verify the approach on simulated and real data. A simulated testing setup is created using various noise types at different SNR levels. For evaluation on real data, we choose the BabyTrain corpus, which consists of children's recordings in uncontrolled environments. We observe consistent gains in every condition over the state-of-the-art augmented Factorized-TDNN x-vector system. On the BabyTrain corpus, we observe relative gains of 10.38% and 12.40% in minDCF and EER, respectively.
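The deep feature loss described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The two-layer ReLU network with random weights is a stand-in for the pre-trained speaker embedding network, and the function names, the L1 distance, and the unweighted sum over layers are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def auxiliary_activations(features, weights):
    """Return the hidden activations of a frozen feed-forward network.

    features: (frames, feat_dim) array.
    weights: list of (in_dim, out_dim) matrices (hypothetical stand-in for
    the pre-trained auxiliary speaker embedding network).
    """
    activations = []
    hidden = features
    for w in weights:
        hidden = np.maximum(hidden @ w, 0.0)  # linear layer followed by ReLU
        activations.append(hidden)
    return activations

def deep_feature_loss(enhanced, clean, weights):
    """Sum over layers of the mean absolute activation difference.

    Assumption: L1 distance with equal layer weights; the abstract only says
    the loss is computed in the hidden activation space.
    """
    acts_enhanced = auxiliary_activations(enhanced, weights)
    acts_clean = auxiliary_activations(clean, weights)
    return sum(
        float(np.mean(np.abs(a - b)))
        for a, b in zip(acts_enhanced, acts_clean)
    )

# Toy data: 100 frames of 40-dimensional features (illustrative sizes only).
rng = np.random.default_rng(0)
weights = [rng.standard_normal((40, 64)), rng.standard_normal((64, 32))]
clean = rng.standard_normal((100, 40))
noisy = clean + 0.5 * rng.standard_normal((100, 40))

print(deep_feature_loss(clean, clean, weights))  # identical inputs give zero loss
print(deep_feature_loss(noisy, clean, weights))  # noisy inputs give a positive loss
```

In an actual training setup, `enhanced` would be the output of the enhancement network applied to noisy features, the auxiliary weights would stay frozen, and only the enhancement network would be updated to reduce this loss.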
Related papers
- Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild (2020) · 0.00
- Unsupervised Feature Enhancement For Speaker Verification (2019) · 5.84
- How To Leverage DNN-based Speech Enhancement For Multi-channel Speaker Verification? (2022) · 0.00
- Speech Denoising With Deep Feature Losses (2018) · 14.23
- Deep Speaker Feature Learning For Text-independent Speaker Verification (2017) · 12.54
- Cross-lingual Speaker Verification With Deep Feature Learning (2017) · 8.35
- Disentangled Speaker And Nuisance Attribute Embedding For Robust Speaker Verification (2020) · 8.60
- A Comparative Re-assessment Of Feature Extractors For Deep Speaker Embeddings (2020) · 8.09