A Wavenet For Speech Denoising
2017 · Dario Rethage, Jordi Pons, Xavier Serra
Abstract
Currently, most speech processing techniques use magnitude spectrograms as a front-end and therefore discard part of the signal by default: the phase. To overcome this limitation, we propose an end-to-end learning method for speech denoising based on Wavenet. The proposed model adaptation retains Wavenet's powerful acoustic modeling capabilities while significantly reducing its time complexity by eliminating its autoregressive nature. Specifically, the model makes use of non-causal, dilated convolutions and predicts target fields instead of a single target sample. The discriminative adaptation of the model we propose learns in a supervised fashion by minimizing a regression loss. These modifications make the model highly parallelizable during both training and inference. Both computational and perceptual evaluations indicate that the proposed method is preferred to Wiener filtering, a common method based on processing the magnitude spectrogram.
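The shape arithmetic behind "non-causal, dilated convolutions" and "target fields" can be illustrated with a minimal sketch. It is not the authors' implementation: the dilation schedule, the 3-tap filter weights, the target-field length, and all function names below are assumptions chosen for illustration, and the sketch omits nonlinearities, gating, and skip connections. It only shows that a stack of centred dilated convolutions maps a window of receptive field + target field - 1 noisy samples to a whole target field of outputs in one pass, with no sample-by-sample autoregression.

```python
def receptive_field(dilations, kernel_size=3):
    """Number of input samples that influence a single output sample."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv_noncausal(x, weights, dilation):
    """'Valid' 1-D dilated convolution, centred on each output sample.

    With an odd number of taps the filter reaches equally far into the past
    and the future, so the output is (len(weights) - 1) * dilation samples
    shorter than the input (half trimmed from each side).
    """
    span = (len(weights) - 1) * dilation
    return [
        sum(w * x[t + i * dilation] for i, w in enumerate(weights))
        for t in range(len(x) - span)
    ]


def predict_target_field(x, dilations, weights):
    """Apply the dilated layers in sequence (linear layers only, for clarity)."""
    for d in dilations:
        x = dilated_conv_noncausal(x, weights, d)
    return x


# Hypothetical settings, not taken from the paper.
dilations = [1, 2, 4, 8]
weights = [0.25, 0.5, 0.25]
target_field = 5

rf = receptive_field(dilations, kernel_size=len(weights))
noisy = [float(i % 7) for i in range(rf + target_field - 1)]
out = predict_target_field(noisy, dilations, weights)

print("receptive field:", rf)
print("input length:", len(noisy), "-> output length:", len(out))
```

Because every output sample in the target field depends only on the input window, all of them can be computed in parallel, which is the property the abstract contrasts with the original autoregressive Wavenet.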
Related papers
- Speech Denoising By Parametric Resynthesis (2019) 7.16
- Deep Speech Denoising With Vector Space Projections (2018) 0.00
- Run-time Adaptation Of Neural Beamforming For Robust Speech Dereverberation And Denoising (2024) 0.00
- Speech Denoising With Deep Feature Losses (2018) 14.23
- Task-specific Optimization Of Virtual Channel Linear Prediction-based Speech Dereverberation Front-end For Far-field Speaker Verification (2021) 2.26
- Integrated Speech Enhancement Method Based On Weighted Prediction Error And DNN For Dereverberation And Denoising (2017) 0.00
- Neural Network-augmented Kalman Filtering For Robust Online Speech Dereverberation In Noisy Reverberant Environments (2022) 0.00
- Towards Speech Enhancement Using A Variational U-net Architecture (2020) 7.81