A Vocoder-free Wavenet Voice Conversion With Non-parallel Data
2019 · Xiaohai Tian, Eng Siong Chng, Haizhou Li
Abstract
In a typical voice conversion system, a vocoder is commonly used for speech-to-features analysis and features-to-speech synthesis. However, the vocoder can be a source of speech quality degradation. This paper presents a vocoder-free voice conversion approach that uses WaveNet with non-parallel training data. Instead of operating on intermediate features, the proposed approach uses WaveNet to map Phonetic PosteriorGrams (PPGs) directly to waveform samples. In this way, it avoids the estimation errors caused by the vocoder and by feature conversion. Additionally, because PPGs are assumed to be speaker independent, the proposed method also reduces the feature mismatch problem found in WaveNet-vocoder-based approaches. Experiments on the CMU-ARCTIC database show that the proposed approach significantly outperforms the baseline approaches in terms of speech quality.
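The direct PPG-to-waveform mapping described in the abstract can be illustrated with a minimal data-preparation sketch. It is not the authors' implementation. It assumes the common WaveNet setup in which frame-level conditioning features are repeated up to the sample rate and waveform samples are quantized with 8-bit mu-law. The abstract confirms neither detail, and the names `mu_law_encode`, `upsample_ppg`, `build_training_pairs`, and `hop_length` are hypothetical.

```python
import math


def mu_law_encode(x, mu=255):
    """Map one sample in [-1, 1] to an integer class in [0, mu].

    Assumption: mu-law companding as in the original WaveNet setup;
    the abstract does not say how waveform samples are represented.
    """
    x = max(-1.0, min(1.0, x))  # clip to the valid range
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int((compressed + 1) / 2 * mu + 0.5)  # shift to [0, mu] and round


def upsample_ppg(ppg_frames, hop_length):
    """Repeat each frame-level PPG vector hop_length times (one per sample).

    Assumption: simple repetition; a real system may use learned upsampling.
    """
    return [frame for frame in ppg_frames for _ in range(hop_length)]


def build_training_pairs(ppg_frames, waveform, hop_length):
    """Pair each waveform sample (as a mu-law class) with its PPG vector."""
    cond = upsample_ppg(ppg_frames, hop_length)
    n = min(len(cond), len(waveform))  # trim any length mismatch
    targets = [mu_law_encode(s) for s in waveform[:n]]
    return list(zip(cond[:n], targets))


# Toy example: 2 PPG frames over 2 phonetic classes, 3 samples per frame.
ppg = [[0.9, 0.1], [0.2, 0.8]]
wav = [0.0, 0.5, -0.5, 1.0, -1.0, 0.0, 0.3]
pairs = build_training_pairs(ppg, wav, hop_length=3)
```

Each pair holds the conditioning vector and the target class for one sample. In this kind of setup a WaveNet would be trained to predict the target class from past samples and the PPG, so no vocoder features appear anywhere in the pipeline.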
Related papers
- Non-parallel Voice Conversion System With Wavenet Vocoder And Collapsed Speech Suppression (2020) 3.58
- Statistical Voice Conversion With Quasi-periodic Wavenet Vocoder (2019) 3.58
- High-quality Nonparallel Voice Conversion Based On Cycle-consistent Adversarial Network (2018) 0.00
- Parallel-data-free Voice Conversion Using Cycle-consistent Adversarial Networks (2017) 0.00
- Refined Wavenet Vocoder For Variational Autoencoder Based Voice Conversion (2018) 7.50
- Baseline System Of Voice Conversion Challenge 2020 With Cyclic Variational Autoencoder And Parallel Wavegan (2020) 4.24
- Voice Conversion From Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks (2017) 16.34
- Vocoder-free Non-parallel Conversion Of Whispered Speech With Masked Cycle-consistent Generative Adversarial Networks (2023) 0.00