Investigating U-Nets With Various Intermediate Blocks For Spectrogram-based Singing Voice Separation
2019 · Woosung Choi, Minseok Kim, Jaehwa Chung, et al.
Abstract
Singing Voice Separation (SVS) aims to separate the singing voice from a given mixed musical signal. Many U-Net-based models have recently been proposed for the SVS task, but no existing work evaluates and compares the various types of intermediate blocks that can be used in the U-Net architecture. In this paper, we introduce a variety of intermediate spectrogram transformation blocks. We implement U-Nets based on these blocks and train them on complex-valued spectrograms so that both magnitude and phase are considered. These networks are then compared on the Signal-to-Distortion Ratio (SDR) metric. A U-Net using a particular block composed of convolutional and fully-connected layers achieves state-of-the-art SDR on the MUSDB singing voice separation task by a large margin of 0.9 dB. Our code and models are available online.
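As a rough illustration of two ideas the abstract mentions, the sketch below shows one way a complex-valued spectrogram could be presented to a real-valued network, and a simplified SDR computation. Both are assumptions made for illustration: stacking real and imaginary parts as two channels is only one possible encoding and is not confirmed by the abstract, and the energy-ratio SDR here is a simplified definition, not necessarily the exact variant computed by the MUSDB evaluation tooling. The function names are hypothetical.

```python
import numpy as np


def complex_to_channels(spec):
    """Stack the real and imaginary parts of a complex spectrogram
    as two real-valued channels (an assumed encoding, for illustration)."""
    spec = np.asarray(spec)
    return np.stack([spec.real, spec.imag], axis=0)


def channels_to_complex(channels):
    """Inverse of complex_to_channels: rebuild the complex spectrogram."""
    channels = np.asarray(channels)
    return channels[0] + 1j * channels[1]


def simple_sdr(reference, estimate, eps=1e-10):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    This is a plain energy ratio, not the full BSS Eval decomposition.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    # eps guards against division by zero and log(0) for silent signals
    return 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical spectrogram with 5 frequency bins and 7 time frames
    spec = rng.standard_normal((5, 7)) + 1j * rng.standard_normal((5, 7))
    channels = complex_to_channels(spec)
    print(channels.shape)  # two real channels over the same bins and frames

    reference = np.ones(100)
    estimate = 0.9 * reference
    print(simple_sdr(reference, estimate))
```

Because the estimate is compared with the reference sample by sample, errors in either magnitude or phase lower this score, which is one reason to model both.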
Related papers
- Improving Singing Voice Separation Using Deep U-Net And Wave-U-Net With Data Augmentation (2019)
- Spectrogram-channels U-Net: A Source Separation Model Viewing Each Channel As The Spectrogram Of Each Source (2018)
- Investigation Of Singing Voice Separation For Singing Voice Detection In Polyphonic Music (2020)
- Improving Singing Voice Separation With The Wave-U-Net Using Minimum Hyperspherical Energy (2019)
- MedleyVox: An Evaluation Dataset For Multiple Singing Voices Separation (2022)
- Wave-U-Net: A Multi-scale Neural Network For End-to-end Audio Source Separation (2018)
- Improved Speech Enhancement With The Wave-U-Net (2018)
- Multi-band Multi-resolution Fully Convolutional Neural Networks For Singing Voice Separation (2019)