PMVC: Data Augmentation-based Prosody Modeling For Expressive Voice Conversion
2023 · Yimin Deng, Huaizhen Tang, Xulong Zhang, et al.
Abstract
Voice conversion (VC), the style transfer task applied to speech, refers to converting one person's speech into new speech that sounds like another person's. Much research has been devoted to improving VC. However, a good voice conversion model should match not only the timbre of the target speaker but also expressive information such as prosody, pace, and pauses. In this context, prosody modeling is crucial for achieving expressive voice conversion that sounds natural and convincing. Prosody modeling is challenging, however, especially without text transcriptions. In this paper, we propose a novel voice conversion framework named 'PMVC', which effectively separates and models the content, timbre, and prosodic information in speech without text transcriptions. Specifically, we introduce a new speech augmentation algorithm for robust prosody extraction. And building upon this, mask and predict mechanism
Related papers
- Enhancing Expressive Voice Conversion With Discrete Pitch-conditioned Flow Matching Model (2025)
- Converting Anyone's Voice: End-to-end Expressive Voice Conversion With A Conditional Diffusion Model (2024)
- Expressive-VC: Highly Expressive Voice Conversion With Attention Fusion Of Bottleneck And Perturbation Features (2022)
- Towards General-purpose Text-instruction-guided Voice Conversion (2023)
- Highly Controllable Diffusion-based Any-to-any Voice Conversion Model With Frame-level Prosody Feature (2023)
- Enriching Source Style Transfer In Recognition-synthesis Based Non-parallel Voice Conversion (2021)
- MSM-VC: High-fidelity Source Style Transfer For Non-parallel Voice Conversion By Multi-scale Style Modeling (2023)
- Zero-shot Voice Conversion Via Self-supervised Prosody Representation Learning (2021)