Deformable Temporal Convolutional Networks For Monaural Noisy Reverberant Speech Separation
2022 · William Ravenscroft, Stefan Goetze, Thomas Hain
Abstract
Speech separation models are used for isolating individual speakers in many speech processing applications. Deep learning models have been shown to lead to state-of-the-art (SOTA) results on a number of speech separation benchmarks. One such class of models, known as temporal convolutional networks (TCNs), has shown promising results for speech separation tasks. A limitation of these models is that they have a fixed receptive field (RF). Recent research in speech dereverberation has shown that the optimal RF of a TCN varies with the reverberation characteristics of the speech signal. In this work, deformable convolution is proposed as a solution to allow TCN models to have dynamic RFs that can adapt to various reverberation times for reverberant speech separation. The proposed models are capable of achieving an 11.1 dB average scale-invariant signal-to-distortion ratio (SISDR) improvement over the input signal on the WHAMR benchmark. A relatively small deformable TCN model of 1.3M paramete
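To make the contrast between a fixed and a dynamic receptive field concrete, the following is a minimal pure-Python sketch and not the authors' implementation. It assumes a 1-D deformable convolution in which each kernel tap samples the input at a fractional position via linear interpolation. The function names (`sample_linear`, `deformable_conv1d`, `receptive_field`) are hypothetical, and the offsets are passed in by hand, whereas a deformable TCN would typically predict them from the input with a small sub-network.

```python
import math


def sample_linear(x, pos):
    """Linearly interpolate sequence x at fractional index pos (zero outside x)."""
    lo = math.floor(pos)
    frac = pos - lo

    def at(i):
        # Zero padding for indices outside the signal.
        return x[i] if 0 <= i < len(x) else 0.0

    return (1.0 - frac) * at(lo) + frac * at(lo + 1)


def deformable_conv1d(x, weights, offsets, dilation=1):
    """Sketch of a 1-D deformable convolution.

    y[t] = sum_k weights[k] * x(t + k * dilation + offsets[t][k])

    offsets[t][k] is an assumed, externally supplied fractional shift for
    tap k at frame t; all-zero offsets reduce this to an ordinary dilated
    convolution.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * sample_linear(x, t + k * dilation + offsets[t][k])
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Fixed receptive field (in frames) of a stack of dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)
```

With all offsets set to zero the layer behaves like a standard dilated convolution, whose stacked receptive field is fixed by the kernel size and dilation factors alone; non-zero offsets let each tap reach nearer or farther in time, which is the sense in which the receptive field becomes dynamic.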
Related papers
- Receptive Field Analysis Of Temporal Convolutional Networks For Monaural Speech Dereverberation (2022)
- Utterance Weighted Multi-dilation Temporal Convolutional Networks For Monaural Speech Dereverberation (2022)
- On Time Domain Conformer Models For Monaural Speech Separation In Noisy Reverberant Acoustic Environments (2023)
- FurcaNeXt: End-to-end Monaural Speech Separation With Dynamic Gated Dilated Temporal Convolutional Networks (2019)
- Two-stage Model And Optimal SI-SNR For Monaural Multi-speaker Speech Separation In Noisy Environment (2020)
- Monaural Speech Enhancement Using A Multi-branch Temporal Convolutional Network (2019)
- Single Channel Speech Enhancement Using Temporal Convolutional Recurrent Neural Networks (2020)
- TFCN: Temporal-frequential Convolutional Network For Single-channel Speech Enhancement (2022)