Monaural Singing Voice Separation With Skip-filtering Connections And Recurrent Inference Of Time-frequency Mask
2017 · Stylianos Ioannis Mimilakis, Konstantinos Drossos, João F. Santos, et al.
Abstract
Singing voice separation based on deep learning relies on time-frequency masking. In many cases the masking process is not a learnable function or is not included in the deep learning optimization. Consequently, most existing methods rely on a post-processing step using generalized Wiener filtering. This work proposes a method that learns and optimizes (during training) a source-dependent mask and does not need that post-processing step. We introduce a recurrent inference algorithm, a sparse transformation step to improve the mask generation process, and a learned denoising filter. The results show an increase of 0.49 dB in signal-to-distortion ratio and 0.30 dB in signal-to-interference ratio compared to previous state-of-the-art approaches for monaural singing voice separation.
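To make the masking idea in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It contrasts a mask applied directly to the mixture magnitude (the element-wise product that a skip-filtering connection performs, here with a hand-computed mask standing in for a learned one) with a generalized Wiener-style ratio mask of the kind typically used as post-processing. The function names, the exponent `alpha`, the toy random spectrograms, and the assumption that source magnitudes add up to the mixture magnitude are all illustrative assumptions.

```python
import numpy as np


def apply_mask(mixture_mag, mask):
    """Element-wise masking of the mixture magnitude spectrogram.

    In a skip-filtering setup the mask would be a network output and the
    training loss would be computed on this product, so the mask is
    optimized implicitly. Here the mask is simply passed in.
    """
    return mask * mixture_mag


def wiener_like_mask(voice_mag, accomp_mag, alpha=2.0, eps=1e-8):
    """Generalized Wiener-style ratio mask from two source magnitude estimates.

    `alpha` is an illustrative exponent (assumption); `eps` avoids
    division by zero in silent time-frequency bins.
    """
    v = voice_mag ** alpha
    a = accomp_mag ** alpha
    return v / (v + a + eps)


# Toy magnitude "spectrograms" with shape (frequency bins, time frames).
rng = np.random.default_rng(0)
voice = rng.random((4, 6))
accomp = rng.random((4, 6))
mixture = voice + accomp  # assumption: magnitudes are additive

mask = wiener_like_mask(voice, accomp)
voice_estimate = apply_mask(mixture, mask)
```

With `alpha=1.0` and additive magnitudes, the masked mixture reproduces the voice magnitude up to the `eps` term, which is a quick sanity check on the sketch. In the approach the abstract describes, the ratio mask above is what the learned, source-dependent mask is meant to replace.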
Related papers
- A Recurrent Encoder-decoder Approach With Skip-filtering Connections For Monaural Singing Voice Separation (2017)
- Htmd-net: A Hybrid Masking-denoising Approach To Time-domain Monaural Singing Voice Separation (2021)
- Mad Twinnet: Masker-denoiser Architecture With Twin Networks For Monaural Sound Source Separation (2018)
- Unsupervised Interpretable Representation Learning For Singing Voice Separation (2020)
- Multichannel Singing Voice Separation By Deep Neural Network Informed DOA Constrained CNMF (2020)
- Singing Voice Separation Using A Deep Convolutional Neural Network Trained By Ideal Binary Mask And Cross Entropy (2018)
- Depthwise Separable Convolutions Versus Recurrent Neural Networks For Monaural Singing Voice Separation (2020)
- Voicefilter: Targeted Voice Separation By Speaker-conditioned Spectrogram Masking (2018)