iMetricGAN: Intelligibility Enhancement for Speech-in-Noise Using Generative Adversarial Network-based Metric Learning
2020 · Haoyu Li, Szu-Wei Fu, Yu Tsao, et al.
Abstract
The intelligibility of natural speech is seriously degraded when exposed to adverse noisy environments. In this work, we propose a deep learning-based speech modification method to compensate for the intelligibility loss, with the constraint that the root mean square (RMS) level and duration of the speech signal are maintained before and after modifications. Specifically, we utilize an iMetricGAN approach to optimize the speech intelligibility metrics with generative adversarial networks (GANs). Experimental results show that the proposed iMetricGAN outperforms conventional state-of-the-art algorithms in terms of objective measures, i.e., speech intelligibility in bits (SIIB) and extended short-time objective intelligibility (ESTOI), under a Cafeteria noise condition. In addition, formal listening tests reveal significant intelligibility gains when both noise and reverberation exist.
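The equal-RMS, equal-duration constraint mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `rms` and `match_rms` and the toy sample values are assumptions made for illustration only. The sketch rescales a modified signal so that its RMS level matches the original, and rejects signals of different length.

```python
import math


def rms(signal):
    """Root mean square level of a non-empty sequence of samples."""
    if not signal:
        raise ValueError("signal must contain at least one sample")
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def match_rms(modified, reference, eps=1e-12):
    """Rescale `modified` so that its RMS equals the RMS of `reference`.

    Hypothetical helper: both signals must have the same number of
    samples, mirroring the equal-duration constraint.
    """
    if len(modified) != len(reference):
        raise ValueError("signals must have the same duration")
    # A single broadband gain restores the original level.
    gain = rms(reference) / max(rms(modified), eps)
    return [gain * x for x in modified]


# Toy values (assumed): the "modified" signal is louder than the original.
original = [0.1, -0.2, 0.3, -0.1]
modified = [0.4, -0.1, 0.5, -0.6]
constrained = match_rms(modified, original)
```

Under this constraint, any intelligibility gain has to come from redistributing energy across time and frequency rather than from simply raising the playback level.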
Related papers
- Multi-metric Optimization Using Generative Adversarial Networks For Near-end Speech Intelligibility Enhancement (2021)
- MetricGAN: Generative Adversarial Networks Based Black-box Metric Scores Optimization For Speech Enhancement (2019)
- MetricGAN-U: Unsupervised Speech Enhancement/Dereverberation Based Only On Noisy/Reverberated Speech (2021)
- CMGAN: Conformer-based Metric-GAN For Monaural Speech Enhancement (2022)
- Conditional Generative Adversarial Networks For Speech Enhancement And Noise-robust Speaker Verification (2017)
- Exploring Speech Enhancement With Generative Adversarial Networks For Robust Speech Recognition (2017)
- On The Behavior Of Intrusive And Non-intrusive Speech Enhancement Metrics In Predictive And Generative Settings (2023)
- Towards Generalized Speech Enhancement With Generative Adversarial Networks (2019)