DRED: Deep Redundancy Coding Of Speech Using A Rate-distortion-optimized Variational Autoencoder
2022 · Jean-Marc Valin, Jan Büthe, Ahmed Mustafa, et al.
Abstract
Despite recent advancements in packet loss concealment (PLC) using deep learning techniques, packet loss remains a significant challenge in real-time speech communication. Redundancy has been used in the past to recover the missing information during losses. However, conventional redundancy techniques are limited in the maximum loss duration they can cover and are often unsuitable for burst packet loss. We propose a new approach based on a rate-distortion-optimized variational autoencoder (RDO-VAE), allowing us to optimize a deep speech compression algorithm for the task of encoding large amounts of redundancy at very low bitrate. The proposed Deep REDundancy (DRED) algorithm can transmit up to 50x redundancy using less than 32 kb/s. Results show that DRED outperforms the existing Opus codec redundancy. We also demonstrate its benefits when operating in the context of WebRTC.
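The "50x redundancy" and "less than 32 kb/s" figures can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the paper. It assumes 20 ms packets, a common Opus frame duration that the abstract does not state. It also assumes that "Nx redundancy" means each packet carries redundant data spanning N packet durations. The function names are hypothetical.

```python
# Back-of-the-envelope arithmetic for the abstract's figures.
# ASSUMPTION: 20 ms packets (not stated in the abstract).
# ASSUMPTION: "Nx redundancy" = each packet carries redundancy
# covering N packet durations of speech.

def redundancy_coverage_ms(redundancy_factor: int, packet_ms: float = 20.0) -> float:
    """Duration of speech (in ms) covered by the redundancy in one packet."""
    return redundancy_factor * packet_ms


def bits_per_packet(bitrate_kbps: float, packet_ms: float = 20.0) -> float:
    """Bit budget per packet: (kb/s) * (ms) = bits."""
    return bitrate_kbps * packet_ms


coverage = redundancy_coverage_ms(50)   # 1000.0 ms, i.e. about 1 second
budget = bits_per_packet(32)            # 640.0 bits, i.e. 80 bytes per packet
```

Under these assumptions, each packet would carry enough redundancy to recover from a burst loss of roughly one second, within a per-packet budget of at most 80 bytes.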
Related papers
- Deep Vocoder: Low Bit Rate Compression Of Speech With Deep Autoencoder (2019)
- Codecslime: Temporal Redundancy Compression Of Neural Speech Codec Via Dynamic Frame Rate (2025)
- RVAE-EM: Generative Speech Dereverberation Based On Recurrent Variational Auto-encoder And Convolutive Transfer Function (2023)
- Low Bit-rate Speech Coding With VQ-VAE And A Wavenet Decoder (2019)
- Feedback Recurrent Autoencoder (2019)
- Speech Prediction Using An Adaptive Recurrent Neural Network With Application To Packet Loss Concealment (2021)
- Variable Bitrate Residual Vector Quantization For Audio Coding (2024)
- Enhancing Into The Codec: Noise Robust Speech Coding With Vector-quantized Autoencoders (2021)