Abstract

Generative adversarial networks have recently demonstrated outstanding performance in neural vocoding, outperforming the best autoregressive and flow-based models. In this paper, we show that this success can be extended to other tasks of conditional audio generation. In particular, building upon HiFi vocoders, we propose HiFi++, a novel general framework for bandwidth extension and speech enhancement. We show that, with the improved generator architecture, HiFi++ performs better than or comparably to the state of the art in these tasks while using significantly fewer computational resources. The effectiveness of our approach is validated through a series of extensive experiments.

Authors

(none)

Tags

  • Speech Enhancement
  • Audio Generation

Stats

  • citations: 38
  • S2 citations: n/a
  • github stars: 0
  • HF likes: 0
  • heat score: 11.93
  • arxiv key: andreev2022hifi

Related papers