Visinger 2: High-fidelity End-to-end Singing Voice Synthesis Enhanced By Digital Signal Processing Synthesizer
2022 Β· Yongmao Zhang, Heyang Xue, Hanzhao Li, et al.
Abstract
End-to-end singing voice synthesis (SVS) model VISinger can achieve better performance than the typical two-stage model with fewer parameters. However, VISinger has several problems: text-to-phase problem, the end-to-end model learns the meaningless mapping of text-to-phase; glitches problem, the harmonic components corresponding to the periodic signal of the voiced segment occurs a sudden change with audible artefacts; low sampling rate, the sampling rate of 24KHz does not meet the application needs of high-fidelity generation with the full-band rate (44.1KHz or higher). In this paper, we propose VISinger 2 to address these issues by integrating the digital signal processing (DSP) methods with VISinger. Specifically, inspired by recent advances in differentiable digital signal processing (DDSP), we incorporate a DSP synthesizer into the decoder to solve the above issues. The DSP synthesizer consists of a harmonic synthesizer and a noise synthesizer to generate periodic and aperiodic s
Authors
(none)
Tags
Stats
Related papers
- Sifisinger: A High-fidelity End-to-end Singing Voice Synthesizer Based On Source-filter Model (2024)4.52
- Visinger: Variational Inference With Adversarial Learning For End-to-end Singing Voice Synthesis (2021)12.99
- Visinger2+: End-to-end Singing Voice Synthesis Augmented By Self-supervised Learning Representation (2024)4.52
- Diffsinger: Singing Voice Synthesis Via Shallow Diffusion Mechanism (2021)23.76
- Period Singer: Integrating Periodic And Aperiodic Variational Autoencoders For Natural-sounding End-to-end Singing Voice Synthesis (2024)2.26
- Cssinger: End-to-end Chunkwise Streaming Singing Voice Synthesis System Based On Conditional Variational Autoencoder (2024)0.00
- Towards Improving The Expressiveness Of Singing Voice Synthesis With BERT Derived Semantic Information (2023)0.00
- Consinger: Efficient High-fidelity Singing Voice Generation With Minimal Steps (2024)2.26