Denoising Auto-encoder With Recurrent Skip Connections And Residual Regression For Music Source Separation
2018 · Jen-Yu Liu, Yi-Hsuan Yang
Abstract
Convolutional neural networks with skip connections have shown good performance in music source separation. In this work, we propose a denoising Auto-encoder with Recurrent skip Connections (ARC). We use 1D convolution along the temporal axis of the time-frequency feature map in all layers of the fully-convolutional network. The use of 1D convolution makes it possible to apply recurrent layers to the intermediate outputs of the convolution layers. We also propose an enhancement network and a residual regression method to further improve the separation result. The recurrent skip connections, the enhancement module, and the residual regression all improve the separation quality. The ARC model with residual regression achieves a signal-to-distortion ratio (SDR) of 5.74 for vocals on MUSDB in SiSEC 2018. We also evaluate the ARC model alone on the older dataset DSD100 (used in SiSEC 2016), where it achieves an SDR of 5.91 for vocals.
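The point about 1D convolution can be made concrete with a toy example. When frequency bins are treated as input channels and the kernel slides only along time, every layer outputs one feature vector per frame, which is the sequence shape a recurrent layer consumes. The sketch below is illustrative only and is not the authors' implementation: the layer sizes, the stride of 1, the plain tanh recurrence standing in for the recurrent unit, the random weights, and the names conv1d_time, recurrent_skip and rand are all assumptions made for this example.

```python
import math
import random


def conv1d_time(x, weights, bias):
    """1D convolution along time with zero padding (output keeps T frames).

    x:       list of T frames, each a list of C_in values
             (frequency bins play the role of input channels)
    weights: kernel indexed as [C_out][K][C_in]
    bias:    list of C_out values
    """
    T, K = len(x), len(weights[0])
    pad = K // 2
    out = []
    for t in range(T):
        frame = []
        for w_o, b_o in zip(weights, bias):
            acc = b_o
            for k in range(K):
                s = t + k - pad
                if 0 <= s < T:  # frames outside the signal count as zeros
                    acc += sum(w * v for w, v in zip(w_o[k], x[s]))
            frame.append(acc)
        out.append(frame)
    return out


def recurrent_skip(seq, w_in, w_rec):
    """Plain tanh recurrence over frames (assumed stand-in for the recurrent unit)."""
    H = len(w_rec)
    h = [0.0] * H
    out = []
    for frame in seq:
        h = [math.tanh(sum(a * v for a, v in zip(w_in[j], frame))
                       + sum(r * p for r, p in zip(w_rec[j], h)))
             for j in range(H)]
        out.append(h)
    return out


def rand(*shape, rng):
    """Nested lists of small random weights with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-0.1, 0.1) for _ in range(shape[0])]
    return [rand(*shape[1:], rng=rng) for _ in range(shape[0])]


rng = random.Random(0)
T, F, C = 6, 8, 4                  # frames, frequency bins, hidden channels (toy sizes)
spec = rand(T, F, rng=rng)         # toy time-major feature map, shape (T, F)

# Encoder: 1D convolution along time, output shape (T, C).
enc = conv1d_time(spec, rand(C, 3, F, rng=rng), rand(C, rng=rng))

# Skip path: the encoder output is already a sequence of frames,
# so a recurrent layer can run over it directly. Shape (T, C).
skip = recurrent_skip(enc, rand(C, C, rng=rng), rand(C, C, rng=rng))

# Decoder: concatenate both paths per frame, then convolve back to F bins.
dec_in = [e + s for e, s in zip(enc, skip)]           # shape (T, 2C)
dec = conv1d_time(dec_in, rand(F, 3, 2 * C, rng=rng), rand(F, rng=rng))

print(len(dec), len(dec[0]))       # number of frames, number of frequency bins
```

With a 2D convolution the intermediate output would keep a separate frequency axis and would need reshaping before a recurrent layer could process it frame by frame. Here no reshaping is needed because the time axis is the only spatial axis.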
Related papers
- A Recurrent Encoder-decoder Approach With Skip-filtering Connections For Monaural Singing Voice Separation (2017) 9.41
- Voice And Accompaniment Separation In Music Using Self-attention Convolutional Neural Network (2020) 0.00
- MMDenseLSTM: An Efficient Combination Of Convolutional And Recurrent Neural Networks For Audio Source Separation (2018) 15.28
- Raw Multi-channel Audio Source Separation Using Multi-resolution Convolutional Auto-encoders (2018) 11.58
- Depthwise Separable Convolutions Versus Recurrent Neural Networks For Monaural Singing Voice Separation (2020) 0.00
- Source Separation And Depthwise Separable Convolutions For Computer Audition (2020) 0.00
- Examining The Mapping Functions Of Denoising Autoencoders In Singing Voice Separation (2019) 8.35
- Monaural Singing Voice Separation With Skip-filtering Connections And Recurrent Inference Of Time-frequency Mask (2017) 10.07