Towards Generalized Speech Enhancement With Generative Adversarial Networks
2019 · Santiago Pascual, Joan Serrà, Antonio Bonafonte
Abstract
The speech enhancement task usually consists of removing additive noise or reverberation that partially mask spoken utterances, affecting their intelligibility. However, little attention is drawn to other, perhaps more aggressive signal distortions like clipping, chunk elimination, or frequency-band removal. Such distortions can have a large impact not only on intelligibility, but also on naturalness or even speaker identity, and require careful signal reconstruction. In this work, we give full consideration to this generalized speech enhancement task, and show it can be tackled with a time-domain generative adversarial network (GAN). In particular, we extend a previous GAN-based speech enhancement system to deal with mixtures of four types of aggressive distortions. Firstly, we propose the addition of an adversarial acoustic regression loss that promotes a richer feature extraction at the discriminator. Secondly, we also make use of a two-step adversarial training schedule, acting
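To make the distortion types named in the abstract concrete, the following is a minimal sketch of clipping, chunk elimination, and frequency-band removal applied to a toy signal. It is not the authors' data pipeline: the function names (`clip`, `eliminate_chunk`, `remove_band`), the thresholds, the chunk position, and the two-sinusoid test signal are all assumptions chosen for illustration, and the band removal uses a naive DFT that is only practical for very short frames.

```python
import cmath
import math


def clip(signal, threshold):
    """Hard-clip samples to [-threshold, threshold] (hypothetical helper)."""
    return [max(-threshold, min(threshold, s)) for s in signal]


def eliminate_chunk(signal, start, length):
    """Zero out `length` samples beginning at index `start` (hypothetical helper)."""
    out = list(signal)
    for i in range(start, min(start + length, len(out))):
        out[i] = 0.0
    return out


def remove_band(signal, low_bin, high_bin):
    """Zero DFT bins low_bin..high_bin and their mirrored bins, then invert.

    Uses a naive O(n^2) DFT, so it is only meant for short toy signals.
    Assumes 0 < low_bin <= high_bin < len(signal) / 2.
    """
    n = len(signal)
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(low_bin, high_bin + 1):
        spectrum[k] = 0
        spectrum[(n - k) % n] = 0  # mirrored bin, so the output stays real
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# Toy stand-in for speech: two sinusoids at DFT bins 3 and 10 of a 64-sample frame.
n = 64
clean = [
    math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 10 * t / n)
    for t in range(n)
]

clipped = clip(clean, 0.4)                 # amplitude distortion
gapped = eliminate_chunk(clean, 16, 8)     # 8 samples silenced
band_removed = remove_band(clean, 8, 12)   # bin-10 component removed
```

In this toy setting, removing bins 8 to 12 leaves only the bin-3 sinusoid, which is the kind of missing content an enhancement model would have to reconstruct rather than merely denoise.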
Related papers
- SEGAN: Speech Enhancement Generative Adversarial Network (2017) · 21.85
- Exploring Speech Enhancement With Generative Adversarial Networks For Robust Speech Recognition (2017) · 16.14
- Multi-metric Optimization Using Generative Adversarial Networks For Near-end Speech Intelligibility Enhancement (2021) · 8.60
- Dynamic Attention Based Generative Adversarial Network With Phase Post-processing For Speech Enhancement (2020) · 0.00
- On Enhancing Speech Emotion Recognition Using Generative Adversarial Networks (2018) · 12.33
- VSEGAN: Visual Speech Enhancement Generative Adversarial Network (2021) · 8.60
- Robust Speech Recognition Using Generative Adversarial Networks (2017) · 11.29
- Boosting Noise Robustness Of Acoustic Model Via Deep Adversarial Training (2018) · 9.23