Diff-SV: A Unified Hierarchical Framework For Noise-robust Speaker Verification Using Score-based Diffusion Probabilistic Models
2023 · Ju-Ho Kim, Jungwoo Heo, Hyun-Seo Shin, et al.
Abstract
Background noise considerably reduces the accuracy and reliability of speaker verification (SV) systems. This problem can be addressed by using a speech enhancement system as a front-end module. Recently, diffusion probabilistic models (DPMs) have exhibited remarkable noise-compensation capabilities in the speech enhancement domain. Building on this success, we propose Diff-SV, a noise-robust SV framework that leverages DPMs. Diff-SV unifies a DPM-based speech enhancement system with a speaker embedding extractor and yields a discriminative, noise-tolerant speaker representation through a hierarchical structure. The proposed model was evaluated under both in-domain and out-of-domain noisy conditions using the VoxCeleb1 test set, an external noise source, and the VOiCES corpus. The experimental results demonstrate that Diff-SV achieves state-of-the-art performance, outperforming recently proposed noise-robust SV systems.
Related papers
- Self-supervised Learning With Diffusion-based Multichannel Speech Enhancement For Speaker Verification Under Noisy Conditions (2023)
- Diffusion-based Adversarial Purification For Speaker Verification (2023)
- LC4SV: A Denoising Framework Learning To Compensate For Unseen Speaker Verification Models (2023)
- VoiceExtender: Short-utterance Text-independent Speaker Verification With Guided Diffusion Model (2023)
- GDiffuSE: Diffusion-based Speech Enhancement With Noise Model Guidance (2025)
- BDDM: Bilateral Denoising Diffusion Models For Fast And High-quality Speech Synthesis (2022)
- A Unified Deep Learning Framework For Short-duration Speaker Verification In Adverse Environments (2020)
- PAS: Partial Additive Speech Data Augmentation Method For Noise Robust Speaker Verification (2023)