An Investigation Of Pre-upsampling Generative Modelling And Generative Adversarial Networks In Audio Super Resolution
2021 · James King, Ramon Viñas Torné, Alexander Campbell, et al.
Abstract
Several deep learning models have been applied successfully to audio super-resolution. Many of these approaches rely on preprocessed feature extraction, which requires substantial domain-specific signal processing knowledge to implement. Convolutional Neural Networks (CNNs) improved upon this framework by learning filters automatically. One example of a convolutional approach is AudioUNet, which takes inspiration from novel methods of upsampling images. Our paper compares the pre-upsampling AudioUNet to a new generative model that upsamples the signal before using deep learning to transform it into a more believable signal. Based on the EDSR network for image super-resolution, the newly proposed model, AudioEDSR, outperforms AudioUNet with a 20% improvement in log spectral distance and a mean opinion score of 4.06 compared to 3.82 for the two-times upsampling case. AudioEDSR also has 87% fewer parameters than AudioUNet. How incorporating AudioUNet into a Wasserstein GAN (with gradient penalty) (WGA
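The abstract reports results in terms of log spectral distance (LSD). As a rough illustration, the sketch below computes one commonly used form of LSD: the root-mean-square difference between the log power spectra of a reference and an estimate, averaged over frames. It is an assumption that the paper uses this exact definition. The frame length, hop size, `eps`, and both function names are illustrative choices, not taken from the paper. A naive DFT keeps the example dependency-free; practical code would use an FFT library.

```python
import cmath
import math


def power_log_spectrum(frame, eps=1e-10):
    """Log10 power spectrum of one frame (non-negative frequency bins only).

    Uses a naive O(n^2) DFT so the sketch needs only the standard library.
    `eps` (an illustrative constant) avoids log10(0) on silent bins.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        spectrum.append(math.log10(abs(coeff) ** 2 + eps))
    return spectrum


def log_spectral_distance(reference, estimate, frame_len=64, hop=32):
    """Frame-averaged RMS difference between two log power spectra."""
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    distances = []
    for start in range(0, len(reference) - frame_len + 1, hop):
        ref = power_log_spectrum(reference[start:start + frame_len])
        est = power_log_spectrum(estimate[start:start + frame_len])
        mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
        distances.append(math.sqrt(mse))
    if not distances:
        raise ValueError("signals are shorter than one frame")
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Toy signals: a low tone plus a high tone, and the low tone alone,
    # standing in for a full-band target and a band-limited estimate.
    low = [math.sin(2 * math.pi * 4 * t / 64) for t in range(256)]
    full = [v + 0.5 * math.sin(2 * math.pi * 24 * t / 64) for t, v in enumerate(low)]
    print(log_spectral_distance(full, low))
```

Under this definition, lower values mean the two spectra are closer, and identical signals give a distance of exactly zero.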
Related papers
- Bandwidth Extension On Raw Audio Via Generative Adversarial Networks (2019)
- BigWavGAN: A Wave-to-wave Generative Adversarial Network For Music Super-resolution (2023)
- NU-GAN: High Resolution Neural Upsampling With GAN (2020)
- EVA-GAN: Enhanced Various Audio Generation Via Scalable Generative Adversarial Networks (2024)
- Phase-aware Music Super-resolution Using Generative Adversarial Networks (2020)
- Audio Super Resolution Using Neural Networks (2017)
- A Unified Neural Architecture For Instrumental Audio Tasks (2019)
- HiFi-SR: A Unified Generative Transformer-convolutional Adversarial Network For High-fidelity Speech Super-resolution (2025)