SEGAN: Speech Enhancement Generative Adversarial Network
2017 · Santiago Pascual, Antonio Bonafonte, Joan Serrà
Abstract
Current speech enhancement techniques operate in the spectral domain and/or exploit some higher-level feature. The majority of them tackle a limited number of noise conditions and rely on first-order statistics. To circumvent these issues, deep networks are being increasingly used, thanks to their ability to learn complex functions from large example sets. In this work, we propose the use of generative adversarial networks for speech enhancement. In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them. We evaluate the proposed model using an independent, unseen test set with two speakers and 20 alternative noise conditions. The enhanced samples confirm the viability of the proposed model, and both objective and subjective evaluations confirm its effectiveness. With that, we open the exploration of generati
Related papers
- VSEGAN: Visual Speech Enhancement Generative Adversarial Network (2021) · 8.60
- Towards Generalized Speech Enhancement With Generative Adversarial Networks (2019) · 10.35
- DCCRGAN: Deep Complex Convolution Recurrent Generator Adversarial Network For Speech Enhancement (2020) · 0.00
- Conditional Generative Adversarial Networks For Speech Enhancement And Noise-robust Speaker Verification (2017) · 16.03
- SEFGAN: Harvesting The Power Of Normalizing Flows And GANs For Efficient High-quality Speech Enhancement (2023) · 5.84
- Multi-metric Optimization Using Generative Adversarial Networks For Near-end Speech Intelligibility Enhancement (2021) · 8.60
- Exploring Speech Enhancement With Generative Adversarial Networks For Robust Speech Recognition (2017) · 16.14
- FNSE-SBGAN: Far-field Speech Enhancement With Schrodinger Bridge And Generative Adversarial Networks (2025) · 3.58