Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder
2023 · Yicheng Gu, Xueyao Zhang, Liumeng Xue, et al.
Abstract
Generative Adversarial Network (GAN) based vocoders offer fast inference and high synthesis quality when reconstructing an audible waveform from an acoustic representation. This study focuses on improving the discriminator of GAN-based vocoders. Most existing discriminators based on time-frequency representations are built on the Short-Time Fourier Transform (STFT), whose time-frequency resolution is fixed across the spectrogram. This makes the STFT a poor fit for signals such as singing voices, which need different resolution in different frequency bands. Motivated by this, our study uses the Constant-Q Transform (CQT), whose resolution varies with frequency, which improves the modeling of pitch accuracy and harmonic tracking. Specifically, we propose a Multi-Scale Sub-Band CQT (MS-SB-CQT) Discriminator, which operates on the CQT spectrogram at multiple scales and performs sub-band processing according to different octaves. Experiments conducted on both speech and si
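The frequency-dependent resolution of the CQT and the per-octave sub-band split described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `cqt_bin_layout` and `split_into_octaves`, the default parameters (`f_min`, `bins_per_octave`, `n_octaves`, `sample_rate`), and the placeholder spectrogram are all assumptions chosen for illustration. It uses the standard CQT definitions, in which center frequencies are geometrically spaced and the ratio of center frequency to bandwidth (Q) is the same for every bin.

```python
import numpy as np

def cqt_bin_layout(f_min=32.7, bins_per_octave=24, n_octaves=9, sample_rate=44100):
    """Return center frequencies, bandwidths and window lengths of CQT bins.

    All default values are illustrative assumptions, not settings from the paper.
    """
    k = np.arange(bins_per_octave * n_octaves)
    # Geometrically spaced center frequencies: each octave doubles the frequency.
    freqs = f_min * 2.0 ** (k / bins_per_octave)
    # Constant quality factor Q = f_k / bandwidth_k, identical for every bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bandwidths = freqs / q
    # Low bins need long analysis windows (fine frequency resolution),
    # high bins need short ones (fine time resolution).
    window_lengths = np.ceil(q * sample_rate / freqs).astype(int)
    return freqs, bandwidths, window_lengths

def split_into_octaves(cqt_spec, bins_per_octave):
    """Reshape a (bins, frames) CQT spectrogram into (octaves, bins_per_octave, frames)."""
    n_bins, n_frames = cqt_spec.shape
    if n_bins % bins_per_octave != 0:
        raise ValueError("number of bins must be a multiple of bins_per_octave")
    return cqt_spec.reshape(n_bins // bins_per_octave, bins_per_octave, n_frames)

if __name__ == "__main__":
    # "Multi-scale" is sketched here as several bins-per-octave settings;
    # the specific values are assumptions.
    for bpo in (24, 36, 48):
        freqs, bandwidths, window_lengths = cqt_bin_layout(bins_per_octave=bpo)
        placeholder_spec = np.zeros((freqs.size, 100))  # stand-in for a real CQT
        sub_bands = split_into_octaves(placeholder_spec, bpo)
        print(bpo, sub_bands.shape, window_lengths[0], window_lengths[-1])
```

In a discriminator along these lines, each octave slice would be processed by its own sub-network before the results are combined, but that part is left out because the abstract does not specify it.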
Related papers
- An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder (2024) 3.58
- VNet: A GAN-based Multi-tier Discriminator Network for Speech Synthesis Vocoders (2024) 2.26
- A Multi-scale Time-frequency Spectrogram Discriminator for GAN-based Non-autoregressive TTS (2022) 6.77
- VocGAN: A High-fidelity Real-time Vocoder with a Hierarchically-nested Adversarial Network (2020) 12.54
- Avocodo: Generative Adversarial Network for Artifact-free Vocoder (2022) 9.41
- VQCPC-GAN: Variable-length Adversarial Audio Synthesis Using Vector-quantized Contrastive Predictive Coding (2021) 5.84
- Efficient Non-autoregressive GAN Voice Conversion Using VQWav2vec Features and Dynamic Convolution (2022) 0.00
- Collective Learning Mechanism Based Optimal Transport Generative Adversarial Network for Non-parallel Voice Conversion (2025) 0.00