DeepVQE: Real Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation
2023 · Evgenii Indenbom, Nicolae-Catalin Ristea, Ando Saabas, et al.
Abstract
Acoustic echo cancellation (AEC), noise suppression (NS) and dereverberation (DR) are an integral part of modern full-duplex communication systems. As the demand for teleconferencing systems increases, addressing these tasks is required for an effective and efficient online meeting experience. Most prior research proposes solutions for these tasks separately, combining them with digital signal processing (DSP) based components, resulting in complex pipelines that are often impractical to deploy in real-world applications. This paper proposes a real-time cross-attention deep model, named DeepVQE, based on residual convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to simultaneously address AEC, NS, and DR. We conduct several ablation studies to analyze the contributions of different components of our model to the overall performance. DeepVQE achieves state-of-the-art performance on non-personalized tracks from the ICASSP 2023 Acoustic Echo Cancellation Challenge a
Related papers
- NeuralEcho: A Self-Attentive Recurrent Neural Network for Unified Acoustic Echo Suppression and Speech Enhancement (2022) · 0.00
- I-DCCRN-VAE: An Improved Deep Representation Learning Framework for Complex VAE-Based Single-Channel Speech Enhancement (2025) · 0.00
- Deep Residual Echo Suppression and Noise Reduction: A Multi-Input FCRN Approach in a Hybrid Speech Enhancement System (2021) · 8.09
- A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation (2022) · 5.84
- Implicit Acoustic Echo Cancellation for Keyword Spotting and Device-Directed Speech Detection (2021) · 3.58
- Joint Neural AEC and Beamforming with Double-Talk Detection (2021) · 3.58
- Multi-Task Deep Residual Echo Suppression with Echo-Aware Loss (2022) · 10.74
- Deep Vocoder: Low Bit Rate Compression of Speech with Deep Autoencoder (2019) · 5.24