An Evaluation Of Three-stage Voice Conversion Framework For Noisy And Reverberant Conditions
2022 · Yeonjong Choi, Chao Xie, Tomoki Toda
Abstract
This paper presents a new voice conversion (VC) framework capable of dealing with both additive noise and reverberation, together with an evaluation of its performance. Several VC studies have focused on real-world conditions in which speech data are corrupted by background noise and reverberation. For the more practical case where no clean target dataset is available, one possible approach is zero-shot VC, but its performance tends to degrade compared with VC trained on a sufficient amount of target speech data. To leverage a large amount of noisy-reverberant target speech data, we propose a three-stage VC framework consisting of a denoising process using a pretrained denoising model, a dereverberation process using a dereverberation model, and a VC process using a nonparallel VC model based on a variational autoencoder. The experimental results show that 1) noise and reverberation additively cause significant VC performance degradation, 2) the proposed method alleviates the adverse eff
Related papers
- Voicy: Zero-shot Non-parallel Voice Conversion In Noisy Reverberant Environments (2021)
- Noise-robust Voice Conversion By Conditional Denoising Training Using Latent Variables Of Recording Quality And Environment (2024)
- VC-ENHANCE: Speech Restoration With Integrated Noise Suppression And Voice Conversion (2024)
- Preserving Background Sound In Noise-robust Voice Conversion Via Multi-task Learning (2022)
- Refined Wavenet Vocoder For Variational Autoencoder Based Voice Conversion (2018)
- Investigation Of F0 Conditioning And Fully Convolutional Networks In Variational Autoencoder Based Voice Conversion (2019)
- Robustness Of Voice Conversion Techniques Under Mismatched Conditions (2016)
- The NU Voice Conversion System For The Voice Conversion Challenge 2020: On The Effectiveness Of Sequence-to-sequence Models And Autoregressive Neural Vocoders (2020)