SingGAN: Generative Adversarial Network For High-fidelity Singing Voice Generation
2021 · Rongjie Huang, Chenye Cui, Feiyang Chen, et al.
Abstract
Deep generative models have achieved significant progress in speech synthesis to date, while high-fidelity singing voice synthesis is still an open problem because of its long continuous pronunciation, rich high-frequency parts, and strong expressiveness. Existing neural vocoders designed for text-to-speech cannot be applied directly to singing voice synthesis because they produce glitches and poor high-frequency reconstruction. In this work, we propose SingGAN, a generative adversarial network designed for high-fidelity singing voice synthesis. Specifically, 1) to alleviate the glitch problem in the generated samples, we propose source excitation with adaptive feature learning filters to expand the receptive field patterns and stabilize long continuous signal generation; 2) SingGAN introduces global and local discriminators at different scales to enrich low-frequency details and promote high-frequency reconstruction; and 3) to improve the training efficiency, SingGAN includes auxil
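The abstract names source excitation as the remedy for glitches but, as quoted here, does not say how the excitation is constructed. As a rough illustration only, the sketch below builds a sine-based excitation from a frame-level F0 contour, a common approach in source-filter neural vocoders. It is an assumption, not SingGAN's confirmed design: the function name `sine_excitation`, the hop length, sample rate, amplitude, and noise levels are all hypothetical choices.

```python
import numpy as np


def sine_excitation(f0_frames, hop_length=256, sample_rate=24000,
                    amplitude=0.1, noise_std=0.003, seed=0):
    """Hypothetical sketch: sample-level excitation from frame-level F0.

    f0_frames holds one F0 value in Hz per frame; 0 marks an unvoiced frame.
    All default values are illustrative assumptions, not taken from the paper.
    """
    f0_frames = np.asarray(f0_frames, dtype=np.float64)

    # Upsample F0 to the sample rate by repeating each frame value.
    f0 = np.repeat(f0_frames, hop_length)
    voiced = f0 > 0

    # Integrate instantaneous frequency so the phase stays continuous
    # across frame boundaries (no phase jumps when F0 changes).
    phase = 2.0 * np.pi * np.cumsum(f0 / sample_rate)

    rng = np.random.default_rng(seed)
    # Voiced regions: sine at F0 plus a small amount of Gaussian noise.
    sine = amplitude * np.sin(phase) + rng.normal(0.0, noise_std, f0.shape)
    # Unvoiced regions: Gaussian noise only.
    noise = rng.normal(0.0, amplitude / 3.0, f0.shape)

    return np.where(voiced, sine, noise)


# Ten voiced frames at 220 Hz followed by five unvoiced frames.
excitation = sine_excitation([220.0] * 10 + [0.0] * 5)
```

A signal like this would typically be fed to the generator alongside the acoustic features, giving it an explicit periodic reference over long sustained notes.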
Related papers
- XiaoiceSing 2: A High-fidelity Singing Voice Synthesizer Based On Generative Adversarial Network (2022) 0.00
- HiFi-WaveGAN: Generative Adversarial Network With Auxiliary Spectrogram-phase Loss For High-fidelity Singing Voice Generation (2022) 0.00
- WGANSing: A Multi-voice Singing Voice Synthesizer Based On The Wasserstein-GAN (2019) 11.08
- InstructSing: High-fidelity Singing Voice Generation Via Instructing Yourself (2024) 0.00
- Mandarin Singing Voice Synthesis With Denoising Diffusion Probabilistic Wasserstein GAN (2022) 6.34
- Adversarially Trained Multi-singer Sequence-to-sequence Singing Synthesizer (2020) 7.81
- SVSGAN: Singing Voice Separation Via Generative Adversarial Network (2017) 0.00
- Improving Adversarial Waveform Generation Based Singing Voice Conversion With Harmonic Signals (2022) 7.50