Mixed-EVC: Mixed Emotion Synthesis And Control In Voice Conversion
2022 · Kun Zhou, Berrak Sisman, Carlos Busso, et al.
Abstract
Emotional voice conversion (EVC) traditionally targets the transformation of spoken utterances from one emotional state to another, with previous research mainly focusing on discrete emotion categories. This paper departs from the norm by introducing a novel perspective: rendering mixed emotions with nuance and enhancing control over emotional expression. To achieve this, we propose a novel EVC framework, Mixed-EVC, which leverages only discrete emotion training labels. We construct an attribute vector that encodes the relationships among these discrete emotions; it is predicted using a ranking-based support vector machine and then integrated into a sequence-to-sequence (seq2seq) EVC framework. Mixed-EVC not only learns to characterize the input emotional style but also quantifies its relevance to other emotions during training. As a result, users can assign these attributes to achieve their desired rendering of mixed emotions. Objective and subjective evaluations c
Related papers
- An Overview & Analysis Of Sequence-to-sequence Emotional Voice Conversion (2022)
- Limited Data Emotional Voice Conversion Leveraging Text-to-speech: Two-stage Sequence-to-sequence Training (2021)
- Towards Realistic Emotional Voice Conversion Using Controllable Emotional Intensity (2024)
- Emoreg: Directional Latent Vector Modeling For Emotional Intensity Regularization In Diffusion-based Voice Conversion (2024)
- ZSDEVC: Zero-shot Diffusion-based Emotional Voice Conversion With Disentangled Mechanism (2024)
- Decoupling Speaker-independent Emotions For Voice Conversion Via Source-filter Networks (2021)
- Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset (2020)
- PAVITS: Exploring Prosody-aware VITS For End-to-end Emotional Voice Conversion (2024)