ComplexDec: A Domain-robust High-fidelity Neural Audio Codec With Complex Spectrum Modeling
2025 · Yi-Chiao Wu, Dejan Marković, Steven Krenn, et al.
Abstract
Neural audio codecs have been widely adopted in audio-generative tasks because their compact, discrete representations suit both large-language-model-style and regression-based generative models. However, most neural codecs struggle to model out-of-domain audio, and the resulting errors propagate to downstream generative tasks. In this paper, we first argue that information loss from codec compression degrades out-of-domain robustness. We then propose ComplexDec, a full-band 48 kHz codec with complex spectral input and output that eases this information loss while keeping the same 24 kbps bitrate as the baselines AudioDec and ScoreDec. Objective and subjective evaluations demonstrate the out-of-domain robustness of ComplexDec trained using only the 30-hour VCTK corpus.
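As a rough illustration of the two technical points in the abstract, the sketch below works out the bit budget implied by 24 kbps at 48 kHz and shows why a complex (real plus imaginary) spectrum keeps information that a magnitude-only spectrum drops. It is not the ComplexDec architecture: the frame length, the non-overlapping framing, and all function names are assumptions made for this example, and a real codec would use an overlapping STFT with learned encoder, quantizer, and decoder stages in between.

```python
import numpy as np

SAMPLE_RATE_HZ = 48_000  # full-band sampling rate stated in the abstract
BITRATE_BPS = 24_000     # bitrate stated in the abstract
FRAME = 512              # hypothetical frame length, not taken from the paper


def bits_per_sample(bitrate_bps: int, sample_rate_hz: int) -> float:
    """Average number of coded bits available per audio sample."""
    return bitrate_bps / sample_rate_hz


def to_complex_channels(wave: np.ndarray, frame: int = FRAME) -> np.ndarray:
    """Frame the signal (no overlap, for simplicity), take the real FFT of
    each frame, and stack real and imaginary parts as two channels."""
    n_frames = len(wave) // frame
    frames = wave[: n_frames * frame].reshape(n_frames, frame)
    spec = np.fft.rfft(frames, axis=-1)  # shape: (n_frames, frame // 2 + 1)
    return np.stack([spec.real, spec.imag], axis=0)


def from_complex_channels(channels: np.ndarray, frame: int = FRAME) -> np.ndarray:
    """Invert to_complex_channels using both real and imaginary parts."""
    spec = channels[0] + 1j * channels[1]
    return np.fft.irfft(spec, n=frame, axis=-1).reshape(-1)


def from_magnitude_only(channels: np.ndarray, frame: int = FRAME) -> np.ndarray:
    """Invert using only the magnitude, i.e. with the phase discarded."""
    magnitude = np.hypot(channels[0], channels[1])
    return np.fft.irfft(magnitude, n=frame, axis=-1).reshape(-1)


rng = np.random.default_rng(0)
wave = rng.standard_normal(FRAME * 8)   # stand-in for an audio signal
channels = to_complex_channels(wave)

print(bits_per_sample(BITRATE_BPS, SAMPLE_RATE_HZ))        # bits per sample
print(np.allclose(from_complex_channels(channels), wave))  # exact round trip?
print(np.allclose(from_magnitude_only(channels), wave))    # phase lost?
```

In this toy setting the complex representation is exactly invertible, so any loss has to come from the stages placed between analysis and synthesis, whereas a magnitude-only representation has already lost the phase before compression begins.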
Related papers
- MDCTCodec: A Lightweight MDCT-based Neural Audio Codec Towards High Sampling Rate And Low Bitrate Scenarios (2024)
- ScoreDec: A Phase-preserving High-fidelity Audio Codec With A Generalized Score-based Diffusion Post-filter (2024)
- Code Drift: Towards Idempotent Neural Audio Codecs (2024)
- NDVQ: Robust Neural Audio Codec With Normal Distribution-based Vector Quantization (2024)
- STFTCodec: High-fidelity Audio Compression Through Time-frequency Domain Representation (2025)
- Towards Evaluating Generative Audio: Insights From Neural Audio Codec Embedding Distances (2025)
- SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec For General Sound (2024)
- Neural Speech And Audio Coding: Modern AI Technology Meets Traditional Codecs (2024)