FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates
2024 · Nicola Pia, Martin Strauss, Markus Multrus, et al.
Abstract
This paper introduces FlowMAC, a novel neural audio codec for high-quality general audio compression at low bit rates based on conditional flow matching (CFM). FlowMAC jointly learns a mel spectrogram encoder, quantizer, and decoder. At inference time, the decoder integrates a continuous normalizing flow via an ODE solver to generate a high-quality mel spectrogram. This is the first application of a CFM-based approach to general audio coding, and it enables scalable, simple, and memory-efficient training. Our subjective evaluations show that FlowMAC at 3 kbps achieves quality similar to that of state-of-the-art GAN-based and DDPM-based neural audio codecs at double the bit rate. Moreover, FlowMAC offers a tunable inference pipeline that allows trading off complexity against quality. This enables real-time coding on CPU while maintaining high perceptual quality.
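The two mechanisms the abstract names, a flow-matching training target and ODE-based inference with a tunable step count, can be illustrated with a minimal pure-Python sketch. It is not the FlowMAC implementation. The straight-line path and velocity in `cfm_training_pair` follow the common optimal-transport formulation of CFM, which is an assumption here. `make_toy_field` is a hypothetical stand-in for a trained decoder network, and all function names are illustrative.

```python
import math


def cfm_training_pair(x0, x1, t, sigma_min=1e-4):
    """Return a point x_t on a straight-line path from noise x0 to data x1,
    and the target velocity u_t a network would be regressed onto.
    Assumes the optimal-transport CFM path (not confirmed for FlowMAC)."""
    xt = [(1.0 - (1.0 - sigma_min) * t) * a + t * b for a, b in zip(x0, x1)]
    ut = [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]
    return xt, ut


def euler_solve(vector_field, x0, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler.
    num_steps is the complexity/quality knob: one field evaluation per step."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def make_toy_field(target):
    """Hypothetical stand-in for a trained network: v(x, t) = target - x.
    Chosen only because its exact solution is known in closed form."""
    def field(x, t):
        return [b - a for a, b in zip(x, target)]
    return field


def toy_exact_solution(x0, target):
    """Exact value of x(1) for the toy field: target + (x0 - target) * e^-1."""
    return [b + (a - b) * math.exp(-1.0) for a, b in zip(x0, target)]


def max_error(x0, target, num_steps):
    """Largest per-coordinate gap between the Euler result and the exact one."""
    approx = euler_solve(make_toy_field(target), x0, num_steps)
    exact = toy_exact_solution(x0, target)
    return max(abs(p - q) for p, q in zip(approx, exact))
```

Calling `max_error([0.0, 0.0], [1.0, -1.0], n)` with increasing `n` shows the integration error shrinking as the step count grows. In this toy setting, that mirrors the complexity/quality trade-off the abstract describes: fewer steps mean fewer network evaluations but a coarser approximation of the flow.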
Related papers
- FlowDec: A Flow-based Full-band General Audio Codec with High Perceptual Quality (2025)
- FlowHigh: Towards Efficient and High-quality Audio Super-resolution with Single-step Flow Matching (2025)
- FlowVocoder: A Small Footprint Neural Vocoder Based Normalizing Flow for Speech Synthesis (2021)
- FlowAVSE: Efficient Audio-visual Speech Enhancement with Conditional Flow Matching (2024)
- Audio Dequantization for High Fidelity Audio Generation in Flow-based Neural Vocoder (2020)
- Accelerating High-fidelity Waveform Generation via Adversarial Flow Matching Optimization (2024)
- FlashAudio: Rectified Flows for Fast and High-fidelity Text-to-audio Generation (2024)
- MDCTCodec: A Lightweight MDCT-based Neural Audio Codec Towards High Sampling Rate and Low Bitrate Scenarios (2024)