A DNN Based Post-filter To Enhance The Quality Of Coded Speech In MDCT Domain
2022 · Kishan Gupta, Srikanth Korse, Bernd Edler, et al.
Abstract
Frequency-domain processing, and in particular the use of the Modified Discrete Cosine Transform (MDCT), is the most widespread approach to audio coding. However, at low bitrates, audio quality, especially for speech, degrades drastically because too few bits are available to code the transform coefficients directly. Traditionally, post-filtering has been used to mitigate artefacts in the coded speech by exploiting a priori information about the source and extra transmitted parameters. Recently, data-driven post-filters have shown better results, but at the cost of significant additional complexity and delay. In this work, we propose a mask-based post-filter operating directly in the MDCT domain of the codec, inducing no extra delay. The real-valued mask is applied to the quantized MDCT coefficients and is estimated from a relatively lightweight convolutional encoder-decoder network. Our solution is tested on the recently standardized low-delay, low-complexity codec (LC3) at lowest possible bitrat
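The masking step described in the abstract can be illustrated with a minimal sketch. It assumes the mask has the same shape as the decoded MDCT coefficients and is applied by element-wise multiplication; the function name `apply_mask`, the array shapes, and all values are hypothetical, and the hand-written `mask` array only stands in for the output of the convolutional encoder-decoder network, which is not modelled here.

```python
import numpy as np


def apply_mask(mdct_coeffs: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Apply a real-valued mask element-wise to MDCT coefficients.

    Assumption: mask and coefficients share one shape (frames x bins).
    """
    if mdct_coeffs.shape != mask.shape:
        raise ValueError("mask and coefficients must have the same shape")
    return mask * mdct_coeffs


# Hypothetical toy data: 2 frames x 4 bins of quantized MDCT coefficients.
quantized = np.array([[4.0, -2.0, 0.0, 1.0],
                      [2.0, 0.0, -1.0, 0.0]])

# Stand-in for the network's estimated mask (values chosen by hand).
# A real-valued mask may scale a coefficient or flip its sign.
mask = np.array([[1.0, 0.5, 1.0, -1.0],
                 [0.5, 1.0, 2.0, 1.0]])

enhanced = apply_mask(quantized, mask)
print(enhanced)
```

Because the operation in this sketch is purely multiplicative, any coefficient quantized to zero stays zero after masking. The masked coefficients would then go through the codec's usual inverse MDCT, which is consistent with the abstract's claim of no extra delay.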
Related papers
- Enhancement Of Coded Speech Using A Mask-based Post-filter (2020) · 8.82
- MDCTCodec: A Lightweight MDCT-based Neural Audio Codec Towards High Sampling Rate And Low Bitrate Scenarios (2024) · 8.09
- PostGAN: A GAN-based Post-processor To Enhance The Quality Of Coded Speech (2022) · 9.76
- Real-time Monaural Speech Enhancement With Short-time Discrete Cosine Transform (2021) · 0.00
- LACE: A Light-weight, Causal Model For Enhancing Coded Speech Through Adaptive Convolutions (2023) · 0.00
- NoLACE: Improving Low-complexity Speech Codec Enhancement Through Adaptive Temporal Shaping (2023) · 7.16
- Concatenated Identical DNN (CI-DNN) To Reduce Noise-type Dependence In DNN-based Speech Enhancement (2018) · 5.24
- End-to-end Speech Enhancement Based On Discrete Cosine Transform (2019) · 8.09