Task And Perception-aware Distributed Source Coding For Correlated Speech Under Bandwidth-constrained Channels
2025 · Sagnik Bhattacharya, Muhammad Ahmed Mohsin, Ahsan Bilal, et al.
Abstract
Emerging wireless AR/VR applications require real-time transmission of correlated high-fidelity speech from multiple resource-constrained devices over unreliable, bandwidth-limited channels. Existing autoencoder-based speech source coding methods fail to jointly address three requirements: (1) dynamic bitrate adaptation without retraining the model, (2) leveraging correlations among multiple speech sources, and (3) balancing downstream task loss with the realism of reconstructed speech. We propose a neural distributed principal component analysis (NDPCA)-aided distributed source coding algorithm for correlated speech sources transmitting to a central receiver. Our method includes a perception-aware downstream task loss function that balances perceptual realism with task-specific performance. Experiments show significant PSNR improvements under bandwidth constraints over naive autoencoder methods in the task-agnostic (19%) and task-aware (52%) settings. It also approaches the theoretical
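The abstract reports its gains in PSNR and describes a loss that trades task performance against perceptual realism. As a rough illustration only, the sketch below computes PSNR with its standard definition and shows one possible way to combine two loss terms. The function names, the `peak` default, and the convex-combination form with a `weight` parameter are assumptions made for this example; the abstract does not specify the actual form of the paper's loss.

```python
import math


def psnr(reference, reconstruction, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences.

    `peak` is the maximum possible sample amplitude (assumed 1.0 for
    normalized audio in this sketch).
    """
    if len(reference) != len(reconstruction):
        raise ValueError("signals must have the same length")
    # Mean squared error between the original and reconstructed samples.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def combined_loss(task_loss, perceptual_loss, weight=0.5):
    """Hypothetical convex combination of a task loss and a perceptual loss.

    `weight` is an assumed trade-off knob: 0 means task-only, 1 means
    perception-only. This is not taken from the paper.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * task_loss + weight * perceptual_loss
```

With a scheme like this, sweeping `weight` would trace out a trade-off curve between task accuracy and perceptual quality, and `psnr` could be used to compare reconstructions at each bandwidth setting.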
Related papers
- Pscodec: A Series Of High-fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders (2024) 0.00
- Spatialcodec: Neural Spatial Speech Coding (2023) 3.69
- Neural Feature Predictor And Discriminative Residual Coding For Low-bitrate Speech Coding (2022) 6.77
- Multi-channel Opus Compression For Far-field Automatic Speech Recognition With A Fixed Bitrate Budget (2021) 5.84
- Rate-adaptive Coding Mechanism For Semantic Communications With Multi-modal Data (2023) 11.93
- A Neural Speech Codec For Noise Robust Speech Coding (2023) 0.00
- Optimizing Neural Speech Codec For Low-bitrate Compression Via Multi-scale Encoding (2024) 0.00
- Phoenixcodec: Taming Neural Speech Coding For Extreme Low-resource Scenarios (2025) 0.00