Everyone-can-sing: Zero-shot Singing Voice Synthesis And Conversion With Speech Reference
2025 · Shuqi Dai, Yunyun Wang, Roger B. Dannenberg, et al.
Abstract
We propose a unified framework for Singing Voice Synthesis (SVS) and Conversion (SVC), addressing the limitations of existing approaches in cross-domain SVS/SVC, poor output musicality, and scarcity of singing data. Our framework enables control over multiple aspects, including language content based on lyrics, performance attributes based on a musical score, singing style and vocal techniques based on a selector, and voice identity based on a speech sample. The proposed zero-shot learning paradigm consists of one SVS model and two SVC models, utilizing pre-trained content embeddings and a diffusion-based generator. The proposed framework is also trained on mixed datasets comprising both singing and speech audio, allowing singing voice cloning based on speech reference. Experiments show substantial improvements in timbre similarity and musicality over state-of-the-art baselines, providing insights into other low-data music tasks such as instrumental style transfer. Examples can be foun
Related papers
- Zero-shot Sing Voice Conversion: Built Upon Clustering-based Phoneme Representations (2024) 0.00
- SaMoye: Zero-shot Singing Voice Conversion Model Based On Feature Disentanglement And Enhancement (2024) 3.50
- LDM-SVC: Latent Diffusion Model Based Zero-shot Any-to-any Singing Voice Conversion With Singer Guidance (2024) 5.84
- TCSinger: Zero-shot Singing Voice Synthesis With Style Transfer And Multi-level Style Control (2024) 7.16
- Real-time And Accurate: Zero-shot High-fidelity Singing Voice Conversion With Multi-condition Flow Synthesis (2024) 0.00
- Leveraging Diverse Semantic-based Audio Pretrained Models For Singing Voice Conversion (2023) 0.00
- Self-supervised Singing Voice Pre-training Towards Speech-to-singing Conversion (2024) 0.00
- VISinger2+: End-to-end Singing Voice Synthesis Augmented By Self-supervised Learning Representation (2024) 4.52