Neural Feature Predictor And Discriminative Residual Coding For Low-bitrate Speech Coding
2022 · Haici Yang, Wootaek Lim, Minje Kim
Abstract
Low- and ultra-low-bitrate neural speech coding achieves unprecedented coding gain by generating speech signals from compact speech features. This paper improves coding efficiency further by using a recurrent neural predictor to reduce the temporal redundancy in the frame-level feature sequence. The prediction yields a low-entropy residual representation, which we code discriminatively according to its contribution to signal reconstruction. Combining feature prediction with discriminative coding results in a dynamic bit allocation algorithm that spends more bits on unpredictable but rare events. As a result, we develop a scalable, lightweight, low-latency, and low-bitrate neural speech coding system. We demonstrate the advantage of the proposed methods using LPCNet as the neural vocoder. While the proposed method guarantees causality in its prediction, the subjective tests and feature space analysis show that our model achieves super
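
The idea of predicting each feature frame from the past and coding only the residual can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' system: a fixed first-order linear predictor (`alpha`) stands in for the recurrent neural predictor, a uniform scalar quantizer (`step`) stands in for the residual codebook, and a simple norm `threshold` stands in for the discriminative coding rule. The names `encode`, `decode`, `alpha`, `step`, and `threshold` are hypothetical.

```python
import numpy as np


def encode(features, alpha=0.9, step=0.1, threshold=0.2):
    """Closed-loop predictive coding of a (frames, dim) feature array.

    Assumption: a first-order linear predictor replaces the paper's
    recurrent neural predictor. Frames whose residual norm is below
    `threshold` are skipped (no residual is sent), so bits are spent
    only on frames that the predictor fails to anticipate.
    """
    prev = np.zeros(features.shape[1])  # last *decoded* frame (causal state)
    stream = []
    for frame in features:
        pred = alpha * prev
        residual = frame - pred
        if np.linalg.norm(residual) < threshold:
            stream.append(None)          # predictable frame: skip residual
            prev = pred
        else:
            codes = np.round(residual / step).astype(int)
            stream.append(codes)         # unpredictable frame: send residual
            prev = pred + codes * step   # track what the decoder will see
    return stream


def decode(stream, dim, alpha=0.9, step=0.1):
    """Mirror of `encode`: rebuild frames from predictions plus residuals."""
    prev = np.zeros(dim)
    frames = []
    for codes in stream:
        pred = alpha * prev
        prev = pred if codes is None else pred + codes * step
        frames.append(prev)
    return np.array(frames)


if __name__ == "__main__":
    # Toy input: silence followed by a sudden jump to a constant level.
    feats = np.vstack([np.zeros((20, 4)), np.ones((30, 4))])
    stream = encode(feats)
    recon = decode(stream, dim=4)
    coded = sum(c is not None for c in stream)
    print(f"coded {coded} of {len(stream)} frames")
    print("max abs error:", np.max(np.abs(recon - feats)))
```

Because the encoder predicts from its own decoded output rather than from the clean input, the encoder and decoder states stay in sync and quantization error does not accumulate. The prediction also uses only past frames, which keeps the toy example causal.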
Related papers
- Latent-domain Predictive Neural Speech Coding (2022)
- Composition Of Deep And Spiking Neural Networks For Very Low Bit Rate Speech Coding (2016)
- A Robust Frame-based Nonlinear Prediction System For Automatic Speech Coding (2016)
- FreeCodec: A Disentangled Neural Speech Codec With Fewer Tokens (2024)
- A Real-time Wideband Neural Vocoder At 1.6 Kb/s Using LPCNet (2019)
- Optimizing Neural Speech Codec For Low-bitrate Compression Via Multi-scale Encoding (2024)
- Disentangled Feature Learning For Real-time Neural Speech Coding (2022)
- PSCodec: A Series Of High-fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders (2024)