ZSDEVC: Zero-shot Diffusion-based Emotional Voice Conversion With Disentangled Mechanism
2024 · Hsing-Hang Chou, Yun-Shao Lin, Ching-Chin Sung, et al.
Abstract
The human voice conveys not just words but also emotional states and individuality. Emotional voice conversion (EVC) modifies emotional expressions while preserving linguistic content and speaker identity, improving applications like human-machine interaction. While deep learning has advanced EVC models for specific target speakers on well-crafted emotional datasets, existing methods often face issues with emotion accuracy and speech distortion. In addition, the zero-shot scenario, in which emotion conversion is applied to unseen speakers, remains underexplored. This work introduces a novel diffusion framework with disentangled mechanisms and expressive guidance, trained on a large emotional speech dataset and evaluated on unseen speakers across in-domain and out-of-domain datasets. Experimental results show that our method produces expressive speech with high emotional accuracy, naturalness, and quality, showcasing its potential for broader EVC applications.
Related papers
- EmoReg: Directional Latent Vector Modeling For Emotional Intensity Regularization In Diffusion-based Voice Conversion (2024)
- Mixed-EVC: Mixed Emotion Synthesis And Control In Voice Conversion (2022)
- Converting Anyone's Voice: End-to-end Expressive Voice Conversion With A Conditional Diffusion Model (2024)
- An Overview & Analysis Of Sequence-to-sequence Emotional Voice Conversion (2022)
- Limited Data Emotional Voice Conversion Leveraging Text-to-speech: Two-stage Sequence-to-sequence Training (2021)
- Seen And Unseen Emotional Style Transfer For Voice Conversion With A New Emotional Speech Dataset (2020)
- Robust Disentangled Variational Speech Representation Learning For Zero-shot Voice Conversion (2022)
- ZSVC: Zero-shot Style Voice Conversion With Disentangled Latent Diffusion Models And Adversarial Training (2025)