DiffSinger: Singing Voice Synthesis Via Shallow Diffusion Mechanism
2021 · Jinglin Liu, Chengxi Li, Yi Ren, et al.
Abstract
Singing voice synthesis (SVS) systems are built to synthesize high-quality and expressive singing voice, in which the acoustic model generates the acoustic features (e.g., mel-spectrogram) given a music score. Previous singing acoustic models adopt a simple loss (e.g., L1 or L2) or a generative adversarial network (GAN) to reconstruct the acoustic features, but they suffer from over-smoothing and unstable training, respectively, which hinders the naturalness of the synthesized singing. In this work, we propose DiffSinger, an acoustic model for SVS based on the diffusion probabilistic model. DiffSinger is a parameterized Markov chain that iteratively converts noise into a mel-spectrogram conditioned on the music score. By implicitly optimizing the variational bound, DiffSinger can be trained stably and generate realistic outputs. To further improve the voice quality and speed up inference, we introduce a shallow diffusion mechanism to make better use of the prior knowledge learned by th
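The Markov-chain view in the abstract can be illustrated with a toy denoising diffusion loop. The sketch below is not the authors' implementation. It assumes a standard DDPM-style formulation with a linear noise schedule, short lists of floats in place of mel-spectrogram frames, and an `oracle` denoiser that stands in for a trained, score-conditioned network. It also assumes that "shallow" means starting the reverse chain at an intermediate step `K < T` from a noised coarse estimate rather than from pure noise at step `T`; the truncated abstract above does not spell this out. All names and values (`linear_betas`, `T`, `K`, `target`, `coarse`) are hypothetical.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.06):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products abar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I); t is 1-indexed."""
    a = abar[t - 1]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


def reverse_step(x_t, t, betas, abar, eps_hat, rng):
    """One ancestral sampling step x_t -> x_{t-1} given predicted noise eps_hat."""
    beta, a = betas[t - 1], abar[t - 1]
    coef = beta / math.sqrt(1.0 - a)
    mean = [(x - coef * e) / math.sqrt(1.0 - beta) for x, e in zip(x_t, eps_hat)]
    if t == 1:
        return mean  # no noise is added on the final step
    sigma = math.sqrt((1.0 - abar[t - 2]) / (1.0 - a) * beta)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


def generate(start, start_step, betas, abar, denoiser, rng):
    """Run the reverse chain from step start_step down to step 1."""
    x = list(start)
    for t in range(start_step, 0, -1):
        x = reverse_step(x, t, betas, abar, denoiser(x, t), rng)
    return x


rng = random.Random(0)
T, K = 100, 40                       # hypothetical full depth and shallow depth
betas = linear_betas(T)
abar = alpha_bars(betas)
target = [0.5, -1.0, 0.25, 2.0]      # stand-in for a ground-truth mel frame
coarse = [0.4, -0.8, 0.3, 1.7]       # stand-in for an over-smoothed estimate


def oracle(x_t, t):
    """Toy denoiser: the exact noise that maps `target` to x_t at step t."""
    a = abar[t - 1]
    return [(x - math.sqrt(a) * v) / math.sqrt(1.0 - a) for x, v in zip(x_t, target)]


# Full chain: start from Gaussian noise at step T.
full = generate([rng.gauss(0.0, 1.0) for _ in target], T, betas, abar, oracle, rng)
# Shallow chain: noise the coarse estimate to step K, then denoise K steps only.
shallow = generate(q_sample(coarse, K, abar, rng), K, betas, abar, oracle, rng)
```

With the oracle denoiser both chains end at `target`, but the shallow chain calls the denoiser `K` times instead of `T`, which is the source of the inference speed-up in this reading. A trained network would only approximate the oracle, so the choice of `K` would matter in practice.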
Related papers
- Mandarin Singing Voice Synthesis With Denoising Diffusion Probabilistic Wasserstein GAN (2022) · 6.34
- HiddenSinger: High-quality Singing Voice Synthesis Via Neural Audio Codec And Latent Diffusion Models (2023) · 0.00
- SiFiSinger: A High-fidelity End-to-end Singing Voice Synthesizer Based On Source-filter Model (2024) · 4.52
- MakeSinger: A Semi-supervised Training Method For Data-efficient Singing Voice Synthesis Via Classifier-free Diffusion Guidance (2024) · 4.52
- ConSinger: Efficient High-fidelity Singing Voice Generation With Minimal Steps (2024) · 2.26
- VISinger: Variational Inference With Adversarial Learning For End-to-end Singing Voice Synthesis (2021) · 12.99
- VISinger 2: High-fidelity End-to-end Singing Voice Synthesis Enhanced By Digital Signal Processing Synthesizer (2022) · 0.00
- DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer And A Comprehensive Evaluation (2022) · 0.00