Codec-SUPERB @ SLT 2024: A Lightweight Benchmark For Neural Audio Codec Models
2024 · Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, et al.
Abstract
Neural audio codec models are becoming increasingly important as they serve as tokenizers for audio, enabling efficient transmission or facilitating speech language modeling. The ideal neural audio codec should maintain content, paralinguistics, speaker characteristics, and audio information even at low bitrates. Recently, numerous advanced neural codec models have been proposed. However, codec models are often tested under varying experimental conditions, which makes their results hard to compare. To address this, we introduce the Codec-SUPERB challenge at SLT 2024, designed to facilitate fair and lightweight comparisons among existing codec models and inspire advancements in the field. The challenge brings together representative speech applications and objective metrics, and carefully selects license-free datasets, sampling them into small sets to reduce the computational cost of evaluation. This paper presents the challenge's rules, datasets, five participant systems, results, and findings.
Related papers
- Investigating Neural Audio Codecs For Speech Language Model-based Speech Generation (2024) · 2.26
- Neural Speech And Audio Coding: Modern AI Technology Meets Traditional Codecs (2024) · 7.16
- PSCodec: A Series Of High-fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders (2024) · 0.00
- OpenACE: An Open Benchmark For Evaluating Audio Coding Performance (2024) · 2.16
- FunCodec: A Fundamental, Reproducible And Integrable Open-source Toolkit For Neural Speech Codec (2023) · 17.47
- CodecSlime: Temporal Redundancy Compression Of Neural Speech Codec Via Dynamic Frame Rate (2025) · 0.00
- ESPnet-Codec: Comprehensive Training And Evaluation Of Neural Codecs For Audio, Music, And Speech (2024) · 9.03
- SUPERB @ SLT 2022: Challenge On Generalization And Efficiency Of Self-supervised Speech Representation Learning (2022) · 9.23