End-to-end Source Separation With Adaptive Front-ends
2017 · Shrikant Venkataramani, Jonah Casebeer, Paris Smaragdis
Abstract
Source separation and other audio applications have traditionally relied on the use of short-time Fourier transforms as a front-end frequency domain representation step. The unavailability of a neural network equivalent to forward and inverse transforms hinders the implementation of end-to-end learning systems for these applications. We present an auto-encoder neural network that can act as an equivalent to short-time front-end transforms. We demonstrate the ability of the network to learn optimal, real-valued basis functions directly from the raw waveform of a signal and further show how it can be used as an adaptive front-end for supervised source separation. In terms of separation performance, these transforms significantly outperform their Fourier counterparts. Finally, we also propose a novel source to distortion ratio based cost function for end-to-end source separation.
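To make the last point of the abstract concrete, here is a minimal sketch of a cost built from a source-to-distortion ratio (SDR). It uses the common textbook definition, 10 log10(||s||^2 / ||s - s_hat||^2), which is an assumption here: the abstract does not give the exact form the authors propose, and their cost function may differ. The function names `sdr_db` and `neg_sdr_loss` are illustrative only.

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-12):
    """SDR in dB under the assumed definition:
    10 * log10(||s||^2 / ||s - s_hat||^2)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    distortion_power = np.sum((reference - estimate) ** 2)
    # eps keeps the ratio finite for silent or perfectly matched signals
    return 10.0 * np.log10((signal_power + eps) / (distortion_power + eps))

def neg_sdr_loss(reference, estimate):
    """Negated SDR, so that minimizing the loss maximizes SDR."""
    return -sdr_db(reference, estimate)

# Toy signals: a 440 Hz tone and two estimates with different noise levels.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440 * t)
noisy_estimate = clean + 0.1 * rng.standard_normal(clean.shape)
better_estimate = clean + 0.01 * rng.standard_normal(clean.shape)

print("loss (noisier estimate):", neg_sdr_loss(clean, noisy_estimate))
print("loss (cleaner estimate):", neg_sdr_loss(clean, better_estimate))
```

The cleaner estimate yields the lower loss. In an end-to-end system, a differentiable version of such a cost would be evaluated on the output waveform rather than on a spectrogram, which is what removes the need for a fixed Fourier front-end in the training objective.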
Related papers
- End-to-end Non-negative Autoencoders For Sound Source Separation (2019) 2.26
- End-to-end Networks For Supervised Single-channel Speech Separation (2018) 0.00
- Raw Multi-channel Audio Source Separation Using Multi-resolution Convolutional Auto-encoders (2018) 11.58
- Wave-u-net: A Multi-scale Neural Network For End-to-end Audio Source Separation (2018) 0.00
- Neural Network Alternatives To Convolutive Audio Models For Source Separation (2017) 0.00
- A Style Transfer Approach To Source Separation (2019) 3.58
- Independence-based Joint Dereverberation And Separation With Neural Source Model (2021) 4.52
- Distortion-controlled Training For End-to-end Reverberant Speech Separation With Auxiliary Autoencoding Loss (2020) 5.84