Towards End-to-end F0 Voice Conversion Based on Dual-GAN with Convolutional Wavelet Kernels
2021 · Clément Le Moine Veillon, Nicolas Obin, Axel Roebel
Abstract
This paper presents an end-to-end framework for F0 transformation in the context of expressive voice conversion. A single neural network is proposed, in which a first module learns an F0 representation over different temporal scales and a second, adversarial module learns the transformation from one emotion to another. The first module is a convolution layer with wavelet kernels, so that the various temporal scales of F0 variation can be encoded efficiently. Combining decomposition and transformation in a single network makes it possible to learn, in an end-to-end manner and directly from the raw F0 signal, the F0 decomposition that is optimal with respect to the transformation.
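To make the idea of a convolution with wavelet kernels concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the wavelet family, the number of scales, or how the kernels are parameterized, so this example assumes a Mexican-hat (Ricker) wavelet at fixed dyadic scales and uses plain NumPy filtering instead of a trainable network layer. The function names (`ricker`, `wavelet_bank`, `decompose`) and the synthetic F0 contour are hypothetical.

```python
import numpy as np

def ricker(length, scale):
    """Mexican-hat wavelet on `length` samples (assumed family, not from the paper)."""
    t = np.arange(length) - (length - 1) / 2.0
    x = t / scale
    w = (1.0 - x**2) * np.exp(-0.5 * x**2)
    w -= w.mean()                  # force zero mean on the truncated support
    return w / np.linalg.norm(w)   # unit energy so scales are comparable

def wavelet_bank(scales, width=4):
    """Stack one kernel per scale, all sampled on a common odd length."""
    length = int(2 * width * max(scales)) + 1
    return np.stack([ricker(length, s) for s in scales])

def decompose(f0, kernels):
    """Filter a 1-D F0 contour with every kernel; returns (n_scales, n_frames)."""
    return np.stack([np.convolve(f0, k, mode="same") for k in kernels])

# Synthetic contour (illustrative only): slow phrase-level movement plus a
# faster syllable-level ripple, in Hz, then converted to log-F0.
frames = np.arange(500)
f0_hz = 120 + 20 * np.sin(2 * np.pi * frames / 200) + 5 * np.sin(2 * np.pi * frames / 20)

scales = [2, 4, 8, 16]             # assumed dyadic scales, in frames
kernels = wavelet_bank(scales)
components = decompose(np.log(f0_hz), kernels)
print(kernels.shape, components.shape)
```

Each row of `components` responds to F0 variation at one temporal scale. In an end-to-end setting such as the one the abstract describes, the kernels would sit inside the network as a convolution layer, so that the decomposition can be optimized jointly with the transformation rather than fixed in advance as it is here.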
Related papers
- Transforming Spectrum And Prosody For Emotional Voice Conversion With Non-parallel Training Data (2020)
- Converting Anyone's Emotion: Towards Speaker-independent Emotional Voice Conversion (2020)
- Investigation Of F0 Conditioning And Fully Convolutional Networks In Variational Autoencoder Based Voice Conversion (2019)
- Non-parallel Emotion Conversion Using A Deep-generative Hybrid Network And An Adversarial Pair Discriminator (2020)
- Multi-speaker Emotion Conversion Via Latent Variable Regularization And A Chained Encoder-decoder-predictor Network (2020)
- WaveCycleGAN: Synthetic-to-natural Speech Waveform Conversion Using Cycle-consistent Adversarial Networks (2018)
- Expressive-VC: Highly Expressive Voice Conversion With Attention Fusion Of Bottleneck And Perturbation Features (2022)
- A Unified Model For Voice And Accent Conversion In Speech And Singing Using Self-supervised Learning And Feature Extraction (2024)