Multi-CMGAN+/+: Leveraging Multi-objective Speech Quality Metric Prediction For Speech Enhancement
2023 · George Close, William Ravenscroft, Thomas Hain, et al.
Abstract
Neural network based approaches to speech enhancement have been shown to be particularly powerful, leveraging a data-driven approach to achieve a significant performance gain over other approaches. Such approaches rely on artificially created labelled training data so that the neural model can be trained using intrusive loss functions, which compare the output of the model with clean reference speech. The performance of such systems when enhancing real-world audio often suffers relative to their performance on simulated test data. In this work, a non-intrusive multi-metric prediction approach is introduced, wherein a model is trained on artificial labelled data using inference of an adversarially trained metric prediction neural network. The proposed approach shows improved performance versus state-of-the-art systems on the recent CHiME-7 challenge UDASE task evaluation sets.
Related papers
- MetricNet: Towards Improved Modeling For Non-intrusive Speech Quality Assessment (2021)
- Multi-metric Optimization Using Generative Adversarial Networks For Near-end Speech Intelligibility Enhancement (2021)
- On The Behavior Of Intrusive And Non-intrusive Speech Enhancement Metrics In Predictive And Generative Settings (2023)
- Attention-based Speech Enhancement Using Human Quality Perception Modelling (2023)
- MetricGAN-U: Unsupervised Speech Enhancement/Dereverberation Based Only On Noisy/Reverberated Speech (2021)
- MetricGAN: Generative Adversarial Networks Based Black-box Metric Scores Optimization For Speech Enhancement (2019)
- CMGAN: Conformer-based Metric-GAN For Monaural Speech Enhancement (2022)
- Multi-modal Hybrid Deep Neural Network For Speech Enhancement (2016)