Towards High-quality And Efficient Speech Bandwidth Extension With Parallel Amplitude And Phase Prediction
2024 · Ye-Xin Lu, Yang Ai, Hui-Peng Du, et al.
Abstract
Speech bandwidth extension (BWE) widens the frequency bandwidth of speech signals, making the speech sound brighter and fuller. This paper proposes a generative adversarial network (GAN) based BWE model with parallel prediction of Amplitude and Phase spectra, named AP-BWE, which achieves both high-quality and efficient wideband speech waveform generation. The proposed AP-BWE generator is entirely based on convolutional neural networks (CNNs). It features a dual-stream architecture with mutual interaction, in which the amplitude stream and the phase stream communicate with each other and extend the high-frequency components of the input narrowband amplitude and phase spectra, respectively. To improve the naturalness of the extended speech signals, we employ a multi-period discriminator at the waveform level and design a pair of multi-resolution amplitude and phase discriminators at the spectral level. Experimental results demonstrate that our p
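The amplitude/phase representation that the abstract refers to can be illustrated with a minimal sketch. This is not the AP-BWE model: it only shows, on a toy frame, that a complex spectrum splits into an amplitude spectrum and a phase spectrum and that the waveform is recovered once both are known, which is why a BWE model can predict the two in parallel. The naive DFT, the frame length `N`, the toy signal, and all function names are assumptions made for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real-valued frame (illustration only)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def split_amplitude_phase(spectrum):
    """Split a complex spectrum into amplitude and phase (radians, in [-pi, pi])."""
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    return amplitude, phase


def combine_amplitude_phase(amplitude, phase):
    """Rebuild the complex spectrum from amplitude and phase."""
    return [cmath.rect(a, p) for a, p in zip(amplitude, phase)]


# Hypothetical toy frame: two sinusoids at bins 3 and 5 (not real speech).
N = 32
frame = [
    math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.cos(2 * math.pi * 5 * t / N)
    for t in range(N)
]

spectrum = dft(frame)
amplitude, phase = split_amplitude_phase(spectrum)
rebuilt = idft(combine_amplitude_phase(amplitude, phase))

# Largest sample-wise difference between the original and rebuilt frame.
max_error = max(abs(a - b) for a, b in zip(frame, rebuilt))
```

In a BWE setting, the narrowband input would have little or no energy in the upper bins of `amplitude`, and the model's task would be to fill in both `amplitude` and `phase` for those bins before the inverse transform.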
Related papers
- DSP-informed Bandwidth Extension Using Locally-conditioned Excitation And Linear Time-varying Filter Subnetworks (2024) 2.26
- Multi-stage Speech Bandwidth Extension With Flexible Sampling Rate Control (2024) 6.34
- UBGAN: Enhancing Coded Speech With Blind And Guided Bandwidth Extension (2025) 0.00
- Speech Bandwidth Expansion Via High Fidelity Generative Adversarial Networks (2024) 0.00
- Waveform Modeling And Generation Using Hierarchical Recurrent Neural Networks For Speech Bandwidth Extension (2018) 12.99
- APNet2: High-quality And High-efficiency Neural Vocoder With Direct Prediction Of Amplitude And Phase Spectra (2023) 6.34
- BAE-Net: A Low Complexity And High Fidelity Bandwidth-adaptive Neural Network For Speech Super-resolution (2023) 6.77
- Generative Adversarial Network Based Speaker Adaptation For High Fidelity WaveNet Vocoder (2018) 5.84