In-the-wild Speech Emotion Conversion Using Disentangled Self-supervised Representations And Neural Vocoder-based Resynthesis
2023 · Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann
Abstract
Speech emotion conversion aims to convert the expressed emotion of a spoken utterance to a target emotion while preserving the lexical information and the speaker's identity. In this work, we specifically focus on in-the-wild emotion conversion where parallel data does not exist, and the problem of disentangling lexical, speaker, and emotion information arises. In this paper, we introduce a methodology that uses self-supervised networks to disentangle the lexical, speaker, and emotional content of the utterance, and subsequently uses a HiFiGAN vocoder to resynthesise the disentangled representations to a speech signal of the targeted emotion. For better representation and to achieve emotion intensity control, we specifically focus on the arousal dimension of continuous representations, as opposed to performing emotion conversion on categorical representations. We test our methodology on the large in-the-wild MSP-Podcast dataset. Results reveal that the proposed approach is aptly cond
Related papers
- EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion For Non-parallel And In-the-wild Data (2023) 5.84
- Converting Anyone's Emotion: Towards Speaker-independent Emotional Voice Conversion (2020) 11.39
- Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset (2020) 16.34
- Textless Speech Emotion Conversion Using Discrete And Decomposed Representations (2021) 10.74
- Non-parallel Emotion Conversion Using A Deep-generative Hybrid Network And An Adversarial Pair Discriminator (2020) 6.77
- Nonparallel Emotional Speech Conversion (2018) 11.08
- Multi-speaker Emotion Conversion Via Latent Variable Regularization And A Chained Encoder-decoder-predictor Network (2020) 5.84
- VAW-GAN For Disentanglement And Recomposition Of Emotional Elements In Speech (2020) 10.74