Monaural Source Separation: From Anechoic To Reverberant Environments
2021 · Tobias Cord-Landwehr, Christoph Boeddeker, Thilo von Neumann, et al.
Abstract
Impressive progress in neural network-based single-channel speech source separation has been made in recent years. However, those improvements have mostly been reported on anechoic data, a condition rarely met in practice. Taking the SepFormer, which achieves state-of-the-art performance on anechoic mixtures, as a starting point, we gradually modify it to optimize its performance on reverberant mixtures. Although this improves the word error rate by 7 percentage points compared to the standard SepFormer implementation, the system ends up performing only marginally better than a PIT-BLSTM separation system that is optimized with rather straightforward means. This is surprising and at the same time sobering, challenging the practical usefulness of many improvements reported in recent years for monaural source separation on nonreverberant data.
Related papers
- End-to-end Networks For Supervised Single-channel Speech Separation (2018)
- Consep: A Noise- And Reverberation-robust Speech Separation Framework By Magnitude Conditioning (2024)
- End-to-end Dereverberation, Beamforming, And Speech Recognition With Improved Numerical Stability And Advanced Frontend (2021)
- Independence-based Joint Dereverberation And Separation With Neural Source Model (2021)
- Investigation Of Practical Aspects Of Single Channel Speech Separation For ASR (2021)
- Short-time Deep-learning Based Source Separation For Speech Enhancement In Reverberant Environments With Beamforming (2020)
- WHAMR!: Noisy And Reverberant Single-channel Speech Separation (2019)
- Investigation Of Monaural Front-end Processing For Robust ASR Without Retraining Or Joint-training (2018)