Leveraging Heteroscedastic Uncertainty In Learning Complex Spectral Mapping For Single-channel Speech Enhancement
2022 · Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, et al.
Abstract
Most speech enhancement (SE) models learn a point estimate and do not make use of uncertainty estimation in the learning process. In this paper, we show that modeling heteroscedastic uncertainty by minimizing a multivariate Gaussian negative log-likelihood (NLL) improves SE performance at no extra cost. During training, our approach augments a model learning complex spectral mapping with a temporary submodel that predicts the covariance of the enhancement error at each time-frequency bin. Because the heteroscedastic uncertainty is unrestricted, the covariance introduces an undersampling effect that is detrimental to SE performance. To mitigate undersampling, our approach inflates the uncertainty lower bound and weights each loss component by its uncertainty, effectively compensating severely undersampled components with larger penalties. Our multivariate setting reveals common covariance assumptions such as scalar and diagonal matrices. By weakening these assumptions, we show that the NLL achieves sup
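As a rough illustration of the loss the abstract describes, the sketch below evaluates a bivariate Gaussian NLL per time-frequency bin, treating the real and imaginary parts of the enhancement error as a 2-D vector with a full 2x2 covariance. This is not the authors' implementation: the function name, the argument names, and the `floor` parameter are hypothetical, and adding a constant to the diagonal is only one assumed way to raise the uncertainty lower bound. The uncertainty-based loss weighting mentioned in the abstract is not included.

```python
import numpy as np

def bivariate_gaussian_nll(err_re, err_im, var_re, var_im, cov_ri, floor=1e-3):
    """Per-bin NLL of a zero-mean 2-D Gaussian over (real, imag) error.

    All names here are hypothetical. `floor` is an assumed way to keep the
    predicted variances away from zero (a lower bound on the uncertainty).
    """
    e_r = np.asarray(err_re, dtype=float)
    e_i = np.asarray(err_im, dtype=float)
    a = np.asarray(var_re, dtype=float) + floor   # variance of the real part
    b = np.asarray(var_im, dtype=float) + floor   # variance of the imaginary part
    c = np.asarray(cov_ri, dtype=float)           # real-imag covariance

    # Determinant of the covariance [[a, c], [c, b]].
    det = a * b - c ** 2
    # Quadratic form e^T Sigma^{-1} e, using the closed-form 2x2 inverse.
    quad = (b * e_r ** 2 - 2.0 * c * e_r * e_i + a * e_i ** 2) / det
    # 0.5 * (log det + quadratic form) plus the constant log(2*pi) for d = 2.
    return 0.5 * (np.log(det) + quad) + np.log(2.0 * np.pi)
```

Setting `cov_ri` to zero gives the diagonal-covariance case, and additionally tying `var_re` to `var_im` gives the scalar case, which correspond to the covariance assumptions the abstract mentions.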
Related papers
- Integrating Statistical Uncertainty Into Neural Network-based Speech Enhancement (2022) 6.34
- Semi-supervised Multichannel Speech Enhancement With Variational Autoencoders And Non-negative Matrix Factorization (2018) 12.25
- Single-channel Speech Enhancement With Deep Complex U-networks And Probabilistic Latent Space Models (2023) 5.24
- A Speech Enhancement Algorithm Based On Non-negative Hidden Markov Model And Kullback-Leibler Divergence (2020) 5.84
- Single-channel Speech Enhancement Using Learnable Loss Mixup (2023) 0.00
- Incorporating Symbolic Sequential Modeling For Speech Enhancement (2019) 0.00
- A Weighted-variance Variational Autoencoder Model For Speech Enhancement (2022) 0.00
- SE Territory: Monaural Speech Enhancement Meets The Fixed Virtual Perceptual Space Mapping (2023) 0.00