Learning Noise-independent Speech Representation For High-quality Voice Conversion For Noisy Target Speakers
2022 · Liumeng Xue, Shan Yang, Na Hu, et al.
Abstract
Building a voice conversion system for noisy target speakers, such as users providing noisy samples or data found on the Internet, is a challenging task, since using contaminated speech in model training degrades conversion performance. In this paper, we build on our recently proposed Glow-WaveGAN and propose a noise-independent speech representation learning approach for high-quality voice conversion for noisy target speakers. Specifically, we learn a latent feature space in which the target distribution modeled by the conversion model is exactly the distribution modeled by the waveform generator. On this premise, we further make the latent feature noise-invariant. To do so, we introduce a noise-controllable WaveGAN, whose encoder learns the noise-independent acoustic representation directly from the waveform and whose decoder performs noise control in the hidden space through a FiLM module. As for the conversi
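The abstract mentions noise control through a FiLM module in the decoder. As a rough illustration of what FiLM (feature-wise linear modulation) does in general, the sketch below scales and shifts each channel of a hidden feature using parameters predicted from a conditioning vector. It is not the paper's implementation: the sizes, the linear predictors, and the `clean_cond` / `noisy_cond` vectors are all assumptions made for the example.

```python
import random

def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """FiLM: per-channel scale and shift predicted from a condition vector.

    features: list of frames, each a list of C channel values
    cond: conditioning vector (a hypothetical noise-condition embedding)
    """
    gamma = linear(w_gamma, b_gamma, cond)  # per-channel scale
    beta = linear(w_beta, b_beta, cond)     # per-channel shift
    return [[g * h + b for g, h, b in zip(gamma, frame, beta)]
            for frame in features]

# Toy sizes (assumptions): 3 channels, 2-dim condition, 4 frames.
rng = random.Random(0)
C, D, T = 3, 2, 4
w_gamma = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(C)]
w_beta = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(C)]
b_gamma = [1.0] * C  # with a zero condition, the scale is the identity
b_beta = [0.0] * C   # with a zero condition, the shift is zero
hidden = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(T)]

clean_cond = [0.0, 0.0]  # hypothetical "clean" condition
noisy_cond = [1.0, 0.5]  # hypothetical "noisy" condition

out_clean = film(hidden, clean_cond, w_gamma, b_gamma, w_beta, b_beta)
out_noisy = film(hidden, noisy_cond, w_gamma, b_gamma, w_beta, b_beta)
```

In this toy setup the zero condition leaves the hidden feature unchanged, while a non-zero condition modulates every channel. The same encoder output can therefore be decoded differently depending only on the condition.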
Related papers
- Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis And Any-to-any Voice Conversion (2022) 7.50
- SLMGAN: Exploiting Speech Language Model Representations For Unsupervised Zero-shot Voice Conversion In GANs (2023) 0.00
- Multi-target Voice Conversion Without Parallel Data By Adversarially Learning Disentangled Audio Representations (2018) 13.60
- Glow-WaveGAN: Learning Speech Representations From GAN-based Variational Auto-encoder For High Fidelity Flow-based Speech Synthesis (2021) 8.35
- Rep2wav: Noise Robust Text-to-speech Using Self-supervised Representations (2023) 0.00
- WaveCycleGAN: Synthetic-to-natural Speech Waveform Conversion Using Cycle-consistent Adversarial Networks (2018) 9.92
- CycleGAN Voice Conversion Of Spectral Envelopes Using Adversarial Weights (2019) 6.77
- Towards Low-resource StarGAN Voice Conversion Using Weight Adaptive Instance Normalization (2020) 7.81