Deep Bayesian Unsupervised Source Separation Based On A Complex Gaussian Mixture Model
2019 · Yoshiaki Bando, Yoko Sasaki, Kazuyoshi Yoshii
Abstract
This paper presents an unsupervised method that trains neural source separation using only multichannel mixture signals. Conventional neural separation methods require large amounts of supervised data to achieve excellent performance. Although multichannel methods based on spatial information can work without such training data, they are often sensitive to parameter initialization and degrade when sources are located close to each other. The proposed method uses a cost function based on a spatial model called a complex Gaussian mixture model (cGMM). This model has the time-frequency (TF) masks and directions of arrival (DoAs) of the sources as latent variables and is used to train separation and localization networks that respectively estimate these variables. This joint training resolves the frequency permutation ambiguity of the spatial model in a unified deep Bayesian framework. In addition, the pre-trained network can be used not only for conducting monaural separation but also for effic
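As a rough illustration of the kind of cost function the abstract describes, the sketch below evaluates the negative log-likelihood of a generic zero-mean cGMM with NumPy, treating the TF masks as per-bin mixture weights. The function name, the array shapes, and the use of one fixed spatial covariance matrix per source and frequency are assumptions made for illustration; the paper's actual model, including how DoAs parameterize the spatial covariances, is not reproduced here.

```python
import numpy as np


def cgmm_negative_log_likelihood(x, masks, covs, eps=1e-10):
    """Negative log-likelihood of a generic zero-mean complex GMM (illustrative).

    x     : (F, T, M) complex multichannel spectrogram (F bins, T frames, M mics)
    masks : (K, F, T) non-negative TF masks, summing to 1 over the K sources
    covs  : (K, F, M, M) Hermitian positive-definite spatial covariances
            (hypothetical parameterization: one matrix per source and frequency)
    """
    n_mics = x.shape[-1]

    # log|det(cov)| for every (source, frequency) pair; shape (K, F)
    _, logdet = np.linalg.slogdet(covs)
    inv = np.linalg.inv(covs)  # (K, F, M, M)

    # Quadratic form x^H cov^{-1} x for every source and TF bin; shape (K, F, T)
    quad = np.einsum("ftm,kfmn,ftn->kft", x.conj(), inv, x).real

    # Log density of the circular complex Gaussian N_C(x | 0, cov)
    log_gauss = -n_mics * np.log(np.pi) - logdet[:, :, None] - quad

    # Mixture over sources, with masks as weights (numerically stable log-sum-exp)
    log_joint = np.log(masks + eps) + log_gauss
    peak = log_joint.max(axis=0)
    log_mix = peak + np.log(np.exp(log_joint - peak).sum(axis=0))

    return -log_mix.sum()
```

In an unsupervised training setup of the kind described, a separation network would output `masks` and a quantity like this would serve as the loss; an automatic-differentiation framework would replace NumPy in that case.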
Related papers
- Unsupervised Music Source Separation Using Differentiable Parametric Source Models (2022) · 10.97
- End-to-end Networks For Supervised Single-channel Speech Separation (2018) · 0.00
- Multichannel Singing Voice Separation By Deep Neural Network Informed DOA Constrained CNMF (2020) · 5.84
- Unsupervised Training For Deep Speech Source Separation With Kullback-Leibler Divergence Based Probabilistic Loss Function (2019) · 9.92
- Generalized Multichannel Variational Autoencoder For Underdetermined Source Separation (2018) · 7.81
- Spatial Loss For Unsupervised Multi-channel Source Separation (2022) · 7.16
- A Comparison And Combination Of Unsupervised Blind Source Separation Techniques (2021) · 0.00
- Multichannel Blind Speech Source Separation With A Disjoint Constraint Source Model (2024) · 0.00