Phase Aware Speech Enhancement Using Realisation Of Complex-valued LSTM
2020 · Raktim Gautam Goswami, Sivaganesh Andhavarapu, K Sri Rama Murty
Abstract
Most deep learning-based speech enhancement (SE) methods estimate the magnitude spectrum of the clean speech signal from the observed noisy speech signal, either by magnitude spectral masking or by regression. These methods reuse the noisy phase when synthesizing the time-domain waveform from the estimated magnitude spectrum. However, recent work has highlighted the importance of phase in SE. One earlier attempt estimated the complex ratio mask (CRM), which takes phase into account, using a complex-valued feed-forward neural network (FFNN). But FFNNs cannot capture the sequential information essential for phase estimation. In this work, we propose a realisation of a complex-valued long short-term memory (RCLSTM) network to estimate the CRM using sequential information along time. The proposed RCLSTM is designed to process complex-valued sequences using complex arithmetic, and hence it preserves the dependencies between the real and imaginary part
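To make two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows how a complex-valued matrix product can be realised with real-valued operations only, and how a complex ratio mask corrects both magnitude and phase when applied to a noisy spectrum. The function names (`complex_matmul`, `ideal_crm`, `apply_crm`), the array shapes, and the random stand-in spectra are all assumptions made for illustration.

```python
import numpy as np


def complex_matmul(w_re, w_im, x_re, x_im):
    """Compute (w_re + j*w_im) @ (x_re + j*x_im) with real arithmetic only.

    This mirrors how a complex-valued layer can be realised in a framework
    that only provides real-valued tensors (hypothetical helper).
    """
    y_re = w_re @ x_re - w_im @ x_im
    y_im = w_re @ x_im + w_im @ x_re
    return y_re, y_im


def ideal_crm(clean_stft, noisy_stft, eps=1e-8):
    """Complex mask M such that M * noisy is approximately clean, per bin."""
    return clean_stft * np.conj(noisy_stft) / (np.abs(noisy_stft) ** 2 + eps)


def apply_crm(mask, noisy_stft):
    """Element-wise complex multiplication: scales magnitude, rotates phase."""
    return mask * noisy_stft


# Random stand-ins for STFT frames (frequency bins x time frames).
rng = np.random.default_rng(0)
shape = (4, 5)
clean = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noisy = clean + noise

# Magnitude-only enhancement keeps the noisy phase, so even with the
# true clean magnitude the complex spectrum is not recovered.
magnitude_only = np.abs(clean) * np.exp(1j * np.angle(noisy))

# A complex ratio mask also corrects the phase.
crm_enhanced = apply_crm(ideal_crm(clean, noisy), noisy)

err_magnitude_only = np.mean(np.abs(magnitude_only - clean))
err_crm = np.mean(np.abs(crm_enhanced - clean))
print("mean error, magnitude-only with noisy phase:", err_magnitude_only)
print("mean error, ideal complex ratio mask:", err_crm)
```

In a trained system the mask would be predicted by the network rather than computed from the clean signal; the ideal mask here only illustrates the target that such a network would try to approximate.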
Related papers
- Magnitude-and-phase-aware Speech Enhancement With Parallel Sequence Modeling (2023)
- DCCRN: Deep Complex Convolution Recurrent Network For Phase-aware Speech Enhancement (2020)
- Phase-aware Speech Enhancement With Deep Complex U-net (2019)
- Phase-incorporating Speech Enhancement Based On Complex-valued Gaussian Process Latent Variable Model (2016)
- Explicit Estimation Of Magnitude And Phase Spectra In Parallel For High-quality Speech Enhancement (2023)
- End-to-end Model For Speech Enhancement By Consistent Spectrogram Masking (2019)
- DCCRGAN: Deep Complex Convolution Recurrent Generator Adversarial Network For Speech Enhancement (2020)
- A Deep Representation Learning-based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder (2023)