Speech Quality Factors For Traditional And Neural-based Low Bit Rate Vocoders
2020 · Wissam A. Jassim, Jan Skoglund, Michael Chinen, et al.
Abstract
This study compares the performance of different algorithms for coding speech at low bit rates. In addition to widely deployed traditional vocoders, a selection of recently developed generative-model-based coders at different bit rates is contrasted. The coded speech is evaluated for different quality aspects: accuracy of pitch period estimation, word error rates for automatic speech recognition, and the influence of speaker gender and coding delays. A number of performance metrics of speech samples taken from a publicly available database were compared with subjective scores. Results from subjective quality assessment do not correlate well with existing full-reference speech quality metrics. The results provide valuable insights into aspects of the speech signal that will be used to develop a novel metric to accurately predict speech quality from generative-model-based coders.
Related papers
- CQNV: A Combination Of Coarsely Quantized Bitstream And Neural Vocoder For Low Rate Speech Coding (2023)
- Wavenet Based Low Rate Speech Coding (2017)
- Low Bit-rate Wideband Speech Coding: A Deep Generative Model Based Approach (2021)
- Low Bit-rate Speech Coding With VQ-VAE And A Wavenet Decoder (2019)
- Improving Opus Low Bit Rate Quality With Neural Speech Synthesis (2019)
- Composition Of Deep And Spiking Neural Networks For Very Low Bit Rate Speech Coding (2016)
- Investigating Neural Audio Codecs For Speech Language Model-based Speech Generation (2024)
- Neural Feature Predictor And Discriminative Residual Coding For Low-bitrate Speech Coding (2022)