BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction And Waveform Generation
2024 · Hui-Peng Du, Ye-Xin Lu, Yang Ai, et al.
Abstract
This paper proposes a novel bidirectional neural vocoder, named BiVocoder, capable of both feature extraction and reverse waveform generation within the short-time Fourier transform (STFT) domain. For feature extraction, the BiVocoder takes the amplitude and phase spectra derived from the STFT as inputs and transforms them into long-frame-shift, low-dimensional features through convolutional neural networks. The extracted features are shown to be suitable for direct prediction by acoustic models, which supports their application in text-to-speech (TTS) tasks. For waveform generation, the BiVocoder restores the amplitude and phase spectra from the features with a symmetric network, followed by an inverse STFT to reconstruct the speech waveform. Experimental results show that the proposed BiVocoder outperforms some baseline vocoders when synthesized speech quality and inference speed are considered together, on both analysis-synthesis and TTS tasks.
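The STFT-domain round trip that the abstract describes can be illustrated with a minimal sketch. It covers only the signal-processing shell: splitting the STFT into amplitude and phase spectra, recombining them, and applying an inverse STFT. The convolutional feature extractor and the symmetric restoration network are not reproduced here, since the abstract does not specify them. The function names, the Hann window, the frame length, and the hop size are all illustrative assumptions, and a naive DFT is used so that the example needs nothing beyond the Python standard library.

```python
import cmath
import math


def hann(n):
    # Periodic Hann window (an assumed choice; the abstract names no window).
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft(signal, frame_len, hop):
    # Naive DFT per windowed frame, keeping the non-negative frequency bins.
    win = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + i] * win[i] for i in range(frame_len)]
        spec = [
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spec)
    return frames


def to_amp_phase(frames):
    # Split each complex bin into amplitude and phase, the two vocoder inputs.
    amp = [[abs(c) for c in f] for f in frames]
    phase = [[cmath.phase(c) for c in f] for f in frames]
    return amp, phase


def from_amp_phase(amp, phase):
    # Recombine amplitude and phase spectra into complex spectra.
    return [[cmath.rect(a, p) for a, p in zip(fa, fp)]
            for fa, fp in zip(amp, phase)]


def istft(frames, frame_len, hop, length):
    # Inverse DFT per frame, then windowed overlap-add with normalization.
    win = hann(frame_len)
    out = [0.0] * length
    norm = [0.0] * length
    for t, spec in enumerate(frames):
        # Rebuild the negative-frequency bins by conjugate symmetry.
        full = list(spec) + [spec[frame_len - k].conjugate()
                             for k in range(frame_len // 2 + 1, frame_len)]
        for n in range(frame_len):
            x = sum(full[k] * cmath.exp(2j * math.pi * k * n / frame_len)
                    for k in range(frame_len)).real / frame_len
            out[t * hop + n] += x * win[n]
            norm[t * hop + n] += win[n] ** 2
    # Samples where the summed squared window is ~0 cannot be recovered.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


# Toy signal standing in for speech (hypothetical test input).
signal = [math.sin(2 * math.pi * 3 * i / 64) + 0.5 * math.cos(2 * math.pi * 7 * i / 64)
          for i in range(64)]
amp, phase = to_amp_phase(stft(signal, frame_len=16, hop=4))
rec = istft(from_amp_phase(amp, phase), frame_len=16, hop=4, length=64)
```

In this sketch the amplitude and phase spectra pass through unchanged, so the reconstruction matches the input wherever the window overlap is non-zero. In a vocoder of the kind described, a learned encoder would sit between `to_amp_phase` and the features, and a learned decoder between the features and `from_amp_phase`.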
Related papers
- ESTVocoder: An Excitation-spectral-transformed Neural Vocoder Conditioned On Mel Spectrogram (2024) · 0.00
- UnivNet: A Neural Vocoder With Multi-resolution Spectrogram Discriminators For High-fidelity Waveform Generation (2021) · 14.80
- A Neural Denoising Vocoder For Clean Waveform Generation From Noisy Mel-spectrogram Based On Amplitude And Phase Predictions (2024) · 0.00
- Mathematical Vocoder Algorithm: Modified Spectral Inversion For Efficient Neural Speech Synthesis (2021) · 0.00
- A Neural Vocoder With Hierarchical Generation Of Amplitude And Phase Spectra For Statistical Parametric Speech Synthesis (2019) · 10.74
- FastFit: Towards Real-time Iterative Neural Vocoder By Replacing U-Net Encoder With Multiple STFTs (2023) · 0.00
- FeatherWave: An Efficient High-fidelity Neural Vocoder With Multi-band Linear Prediction (2020) · 8.35
- APNet: An All-frame-level Neural Vocoder Incorporating Direct Prediction Of Amplitude And Phase Spectra (2023) · 9.59