NU-Wave 2: A General Neural Audio Upsampling Model For Various Sampling Rates
2022 · Seungu Han, Junhyeok Lee
Abstract
Conventionally, audio super-resolution models fix the initial and target sampling rates, which requires training a separate model for each pair of sampling rates. We introduce NU-Wave 2, a diffusion model for neural audio upsampling that generates 48 kHz audio signals from inputs of various sampling rates with a single model. Based on the architecture of NU-Wave, NU-Wave 2 uses short-time Fourier convolution (STFC) to generate harmonics, resolving the main failure modes of NU-Wave, and incorporates bandwidth spectral feature transform (BSFT) to condition on the bandwidths of inputs in the frequency domain. We experimentally demonstrate that NU-Wave 2 produces high-resolution audio regardless of the input sampling rate while requiring fewer parameters than other models. The official code and the audio samples are available at https://mindslab-ai.github.io/nuwave2.
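To make the problem setup concrete, the following is a minimal sketch of how band-limited inputs for several source sampling rates could be simulated on a common 48 kHz grid, together with a per-frequency band indicator of the kind a bandwidth-conditioning module might consume. It is not the authors' implementation: the helper name `band_limit`, the ideal FFT-domain low-pass filter, and the test tones are all assumptions chosen for illustration.

```python
import numpy as np

TARGET_SR = 48_000  # target sampling rate stated in the abstract


def band_limit(wave, source_sr, target_sr=TARGET_SR):
    """Hypothetical helper: zero all content above source_sr / 2.

    Returns the band-limited waveform (still on the target_sr grid) and a
    boolean band indicator that is True where the input retains content.
    """
    spectrum = np.fft.rfft(wave)
    freqs = np.fft.rfftfreq(len(wave), d=1.0 / target_sr)
    keep = freqs <= source_sr / 2
    return np.fft.irfft(spectrum * keep, n=len(wave)), keep


# One second of audio with a 1 kHz tone and a 10 kHz tone (assumed test signal).
t = np.arange(TARGET_SR) / TARGET_SR
tone_1k = np.sin(2 * np.pi * 1_000 * t)
tone_10k = np.sin(2 * np.pi * 10_000 * t)
wave = tone_1k + tone_10k

# The same function covers every source rate; only the cutoff changes.
low_8k, keep_8k = band_limit(wave, 8_000)     # cutoff 4 kHz: 10 kHz tone removed
low_24k, keep_24k = band_limit(wave, 24_000)  # cutoff 12 kHz: both tones kept
```

An upsampling model in this setting would receive `low_8k` or `low_24k` and be asked to reconstruct the missing band above the cutoff. The band indicator is one simple way to tell a single model which frequencies are already present in its input.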
Related papers
- NU-Wave: A Diffusion Probabilistic Model For Neural Audio Upsampling (2021)
- NU-GAN: High Resolution Neural Upsampling With GAN (2020)
- Audio Super Resolution Using Neural Networks (2017)
- MSR-NV: Neural Vocoder Using Multiple Sampling Rates (2021)
- Wave-U-Net: A Multi-scale Neural Network For End-to-end Audio Source Separation (2018)
- UniverSR: Unified And Versatile Audio Super-resolution Via Vocoder-free Flow Matching (2025)
- An Investigation Of Pre-upsampling Generative Modelling And Generative Adversarial Networks In Audio Super Resolution (2021)
- Efficient Neural Audio Synthesis (2018)