Abstract

We present a speech coding scheme employing a generative model based on SampleRNN that, while operating at significantly lower bitrates, matches or surpasses the perceptual quality of state-of-the-art classic wide-band codecs. Moreover, we demonstrate that the proposed scheme can provide a meaningful rate-distortion trade-off without retraining. We evaluate the proposed scheme in a series of listening tests and discuss limitations of the approach.

Authors

(none)

Tags

  • Uncategorized

Stats

  • citations: 35
  • S2 citations: n/a
  • github stars: 0
  • HF likes: 0
  • heat score: 11.67
  • arxiv key: klejsa2018high

Related papers