VoiceID Loss: Speech Enhancement for Speaker Verification
2019 · Suwon Shon, Hao Tang, James Glass
Abstract
In this paper, we propose VoiceID loss, a novel loss function for training a speech enhancement model to improve the robustness of speaker verification. Unlike commonly used speech enhancement losses such as the L2 loss, the VoiceID loss is based on feedback from a speaker verification model, which guides the enhancement network to generate a ratio mask. The generated ratio mask is multiplied pointwise with the original spectrogram to filter out components that are unnecessary for speaker verification. In the experiments, we observed that the enhancement network, after training with the VoiceID loss, is able to ignore a substantial number of time-frequency bins, such as those dominated by noise, for verification. The resulting model consistently improves the speaker verification system in both clean and noisy conditions.
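The pipeline described in the abstract (enhancement network, ratio mask, pointwise multiplication, feedback from a verification model) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-layer sigmoid enhancer, the mean-pooling linear stand-in for the verification model, the cross-entropy form of the feedback, and all names (`ratio_mask`, `voiceid_style_loss`, `w_enh`, `b_enh`, `w_ver`). Only the forward pass is shown; in training, the verification weights would stay fixed and only the enhancer weights would be updated.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def ratio_mask(spec, w_enh, b_enh):
    """Hypothetical one-layer enhancer: per-frame linear map plus sigmoid.

    spec has shape (frames, bins); the mask has the same shape, values in (0, 1).
    """
    return sigmoid(spec @ w_enh + b_enh)


def voiceid_style_loss(spec, speaker_id, w_enh, b_enh, w_ver):
    """Forward pass: mask the spectrogram, then score it with a fixed verifier."""
    mask = ratio_mask(spec, w_enh, b_enh)
    enhanced = mask * spec  # pointwise multiplication with the original spectrogram

    # Stand-in verification model (assumption): mean-pool over time,
    # then a linear layer producing one logit per training speaker.
    pooled = enhanced.mean(axis=0)
    logits = pooled @ w_ver

    # Cross-entropy against the true speaker (assumed form of the feedback),
    # computed with a numerically stable log-softmax.
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[speaker_id], mask


# Toy data: 50 frames, 16 frequency bins, 4 speakers.
rng = np.random.default_rng(0)
frames, bins, speakers = 50, 16, 4
spec = np.abs(rng.normal(size=(frames, bins)))
w_enh = rng.normal(scale=0.1, size=(bins, bins))
b_enh = np.zeros(bins)
w_ver = rng.normal(scale=0.1, size=(bins, speakers))

loss, mask = voiceid_style_loss(spec, 2, w_enh, b_enh, w_ver)
```

Because the loss depends on the mask only through the verifier's output, nothing in this setup forces the masked spectrogram to resemble clean speech; the mask is free to suppress any bins that do not help identify the speaker.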