Toward Speech Separation in the Pre-Cocktail Party Problem with TasTas

Abstract

In this note, we propose using TasTas \cite{shi2020speech} as an end-to-end approach to monaural speech separation in the pre-cocktail party problem. Our experiments on the public WSJ0-5mix corpus yield a 10.41 dB SDR improvement. When online voice data remixing augmentation \cite{zeghidour2020wavesplit} is adopted in training, an 11.14 dB SDR improvement is achieved. We have open-sourced our re-implementation of DPRNN-TasNet at https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation, and TasTas is built on this implementation, so we believe the results in this paper can be reproduced with ease.
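For readers unfamiliar with the metric, the following is a minimal sketch of how an SDR improvement figure can be computed for one source. It assumes the plain energy-ratio definition of SDR, which may differ from the evaluation toolkit used for the reported numbers; the function names and the toy signals are hypothetical and chosen only for illustration.

```python
import math


def sdr_db(reference, estimate):
    """Plain SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    Assumption: simple energy-ratio SDR, not a BSS-eval style decomposition.
    Both inputs are equal-length sequences of samples; the estimate must
    differ from the reference, otherwise the error power is zero.
    """
    signal_power = sum(s * s for s in reference)
    error_power = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / error_power)


def sdr_improvement_db(reference, estimate, mixture):
    """SDR of the separated estimate minus SDR of the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)


# Toy signals (hypothetical): one target source plus one interferer.
reference = [1.0, -1.0, 1.0, -1.0]
interference = [0.5, 0.5, -0.5, -0.5]
mixture = [r + i for r, i in zip(reference, interference)]
# A pretend separator output that leaves 10% of the interferer amplitude.
estimate = [r + 0.1 * i for r, i in zip(reference, interference)]

print(round(sdr_improvement_db(reference, estimate, mixture), 2))
```

In this toy case the residual interference amplitude drops by a factor of 10, so the improvement is about 20 dB. For a multi-speaker mixture, the per-source improvements would typically be averaged over all speakers.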
