More For Less: Non-intrusive Speech Quality Assessment With Limited Annotations
2021 · Alessandro Ragano, Emmanouil Benetos, Andrew Hines
Abstract
Non-intrusive speech quality assessment is a crucial operation in multimedia applications. The scarcity of annotated data and the lack of a reference signal represent some of the main challenges for designing efficient quality assessment metrics. In this paper, we propose two multi-task models to tackle the problems above. In the first model, we first learn a feature representation with a degradation classifier on a large dataset. Then we perform MOS prediction and degradation classification simultaneously on a small dataset annotated with MOS. In the second approach, the initial stage consists of learning features with a deep clustering-based unsupervised feature representation on the large dataset. Next, we perform MOS prediction and cluster label classification simultaneously on a small dataset. The results show that the deep clustering-based model outperforms the degradation classifier-based model and the 3 baselines (autoencoder features, P.563, and SRMRnorm) on TCD-VoIP. This pap
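The joint objective described in the abstract, MOS prediction trained together with a classification task (degradation type or cluster label), can be illustrated with a minimal sketch. The abstract does not state the loss functions or how the two tasks are weighted, so everything below is an assumption made for illustration: squared error for the MOS term, cross-entropy for the classification term, and a hypothetical weight `alpha` that balances them. The names `softmax` and `multitask_loss` are likewise illustrative, not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(mos_pred, mos_true, class_logits, class_label, alpha=0.5):
    """Weighted sum of a MOS regression loss and a classification loss.

    Assumptions (not from the paper): squared error for MOS,
    cross-entropy for the class label, and a hypothetical weight
    `alpha` in [0, 1] trading off the two tasks.
    """
    mos_term = (mos_pred - mos_true) ** 2
    class_term = -math.log(softmax(class_logits)[class_label])
    return alpha * mos_term + (1.0 - alpha) * class_term


# One utterance: predicted MOS 3.5 against a label of 3.0, with three
# candidate classes (degradation types or cluster labels) and true class 0.
loss = multitask_loss(3.5, 3.0, [2.0, 0.5, -1.0], 0)
```

In a two-stage setup like the one described, the classification head would first be trained alone on the large dataset, and a joint loss of this kind would then be applied on the small MOS-annotated dataset.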
Related papers
- MetricNet: Towards Improved Modeling For Non-intrusive Speech Quality Assessment (2021) 0.00
- Non-intrusive Speech Quality Assessment Using Neural Networks (2019) 13.74
- Multi-task Pseudo-label Learning For Non-intrusive Speech Quality Assessment Model (2023) 0.00
- CCATMOS: Convolutional Context-aware Transformer Network For Non-intrusive Speech Quality Assessment (2022) 5.24
- MMMOS: Multi-domain Multi-axis Audio Quality Assessment (2025) 0.00
- Quality-Net: An End-to-end Non-intrusive Speech Quality Assessment Model Based On BLSTM (2018) 15.62
- AttentiveMOS: A Lightweight Attention-only Model For Speech Quality Prediction (2024) 3.58
- Multi-CMGAN+/+: Leveraging Multi-objective Speech Quality Metric Prediction For Speech Enhancement (2023) 0.00