Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder
2022 · Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda
Abstract
Our previous work, the unified source-filter GAN (uSFGAN) vocoder, introduced a novel architecture based on the source-filter theory into the parallel waveform generative adversarial network to achieve high voice quality and pitch controllability. However, its high-temporal-resolution inputs result in high computation costs. Although the HiFi-GAN vocoder achieves fast high-fidelity voice generation thanks to its efficient upsampling-based generator architecture, its pitch controllability is severely limited. To realize a fast and pitch-controllable high-fidelity neural vocoder, we introduce the source-filter theory into HiFi-GAN by hierarchically conditioning the resonance filtering network on well-estimated source excitation information. According to the experimental results, our proposed method outperforms HiFi-GAN and uSFGAN on singing voice generation in voice quality and synthesis speed on a single CPU. Furthermore, unlike the uSFGAN vocoder, the proposed method can be easily
Related papers
- Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis (2023) 3.58
- Unified Source-Filter GAN with Harmonic-plus-Noise Source Excitation Generation (2022) 0.00
- Unified Source-Filter GAN: Unified Source-Filter Network Based on Factorization of Quasi-Periodic Parallel WaveGAN (2021) 7.81
- HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform (2023) 0.00
- HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation (2022) 0.00
- HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis (2020) 0.00
- VocGAN: A High-Fidelity Real-Time Vocoder with a Hierarchically-Nested Adversarial Network (2020) 12.54
- HiFi++: A Unified Framework for Bandwidth Extension and Speech Enhancement (2022) 11.93