HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
2020 · Jungil Kong, Jaehyeon Kim, Jaekyoung Bae
Abstract
Several recent works on speech synthesis have employed generative adversarial networks (GANs) to produce raw waveforms. Although such methods improve the sampling efficiency and memory usage, their sample quality has not yet reached that of autoregressive and flow-based generative models. In this work, we propose HiFi-GAN, which achieves both efficient and high-fidelity speech synthesis. As speech audio consists of sinusoidal signals with various periods, we demonstrate that modeling the periodic patterns of audio is crucial for enhancing sample quality. A subjective human evaluation (mean opinion score, MOS) on a single-speaker dataset indicates that our proposed method demonstrates similarity to human quality while generating 22.05 kHz high-fidelity audio 167.9 times faster than real-time on a single V100 GPU. We further show the generality of HiFi-GAN to the mel-spectrogram inversion of unseen speakers and end-to-end speech synthesis. Finally, a small footprint version of HiFi-GAN gen
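To put the reported speed in concrete terms, the following is a minimal sketch of the arithmetic implied by "167.9 times faster than real-time" at 22.05 kHz. The function and variable names are illustrative assumptions, not part of the paper or its code. Only the two figures from the abstract are used as inputs.

```python
# Illustrative arithmetic only: the names below are hypothetical helpers,
# not taken from the HiFi-GAN code base.

def samples_per_second(speedup_over_real_time: float, sample_rate_hz: int) -> float:
    """Audio samples generated per second of wall-clock time."""
    return speedup_over_real_time * sample_rate_hz


def seconds_to_generate(audio_seconds: float, speedup_over_real_time: float) -> float:
    """Wall-clock time needed to synthesize a clip of the given duration."""
    return audio_seconds / speedup_over_real_time


SPEEDUP = 167.9        # "167.9 times faster than real-time" (from the abstract)
SAMPLE_RATE = 22_050   # 22.05 kHz audio (from the abstract)

throughput = samples_per_second(SPEEDUP, SAMPLE_RATE)
one_minute = seconds_to_generate(60.0, SPEEDUP)

print(f"{throughput / 1e6:.2f} million samples per second")
print(f"{one_minute:.3f} s of compute for 60 s of audio")
```

Under these figures the throughput works out to roughly 3.7 million samples per second, so a one-minute clip would take well under half a second to synthesize on the stated hardware.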
Related papers
- HiFi-WaveGAN: Generative Adversarial Network With Auxiliary Spectrogram-phase Loss For High-fidelity Singing Voice Generation (2022)
- TFGAN: Time And Frequency Domain Based Generative Adversarial Network For High-fidelity Speech Synthesis (2020)
- HiFi++: A Unified Framework For Bandwidth Extension And Speech Enhancement (2022)
- HiFTNet: A Fast High-quality Neural Vocoder With Harmonic-plus-noise Filter And Inverse Short Time Fourier Transform (2023)
- SpecDiff-GAN: A Spectrally-shaped Noise Diffusion GAN For Speech And Music Synthesis (2024)
- HiFi-SR: A Unified Generative Transformer-convolutional Adversarial Network For High-fidelity Speech Super-resolution (2025)
- High Fidelity Speech Synthesis With Adversarial Networks (2019)
- HiFi-GAN: High-fidelity Denoising And Dereverberation Based On Speech Deep Features In Adversarial Networks (2020)