Phase-incorporating Speech Enhancement Based On Complex-valued Gaussian Process Latent Variable Model
2016 · Sih-Huei Chen, Yuan-Shan Lee, Jia-Ching Wang
Abstract
Traditional speech enhancement techniques modify only the magnitude of the speech signal in the time-frequency domain and reuse the phase of the noisy speech to resynthesize the time-domain signal. This work proposes a complex-valued Gaussian process latent variable model (CGPLVM) that directly enhances the complex-valued noisy spectrum, modifying the phase as well as the magnitude. The main idea underlying the method is to model the short-time Fourier transform (STFT) coefficients across the time frames of a speech signal as a proper complex Gaussian process (GP) with additive noise. The method projects the spectrum into a low-dimensional subspace, and the hyperparameters of the model are optimized using a likelihood criterion. Experiments were carried out on the CHTTL database, which contains the Mandarin digits zero to nine. Several standard measures show that the proposed method outperforms the baseline methods.
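To make the likelihood criterion concrete, the following is a minimal sketch of the log-likelihood of a zero-mean proper complex Gaussian process with additive noise, evaluated on a complex STFT matrix. It is not the authors' implementation: the abstract does not specify the kernel, the latent parametrization, or the optimizer, so the real-valued RBF kernel, the function names (`rbf_kernel`, `complex_gp_log_likelihood`), and the assumption that frequency bins are independent given the latent points are all illustrative choices.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Real-valued RBF kernel over latent points X of shape (N, Q).

    Assumption: the kernel used in the paper is not given in the abstract;
    RBF is used here only as a common default.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale ** 2)

def complex_gp_log_likelihood(Y, X, lengthscale, variance, noise_var):
    """Log-likelihood of complex STFT coefficients under a proper complex GP.

    Y: (N, D) complex array, N time frames by D frequency bins.
    X: (N, Q) real latent points, one low-dimensional point per frame.
    Each column of Y is modeled as CN(0, K + noise_var * I), with columns
    treated as independent (an assumption of this sketch).
    """
    N, D = Y.shape
    K = rbf_kernel(X, lengthscale, variance) + noise_var * np.eye(N)
    L = np.linalg.cholesky(K)           # K = L @ L.T
    alpha = np.linalg.solve(L, Y)       # L^{-1} Y, complex-valued
    quad = np.sum(np.abs(alpha) ** 2)   # sum over columns of y^H K^{-1} y
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    # Proper complex Gaussian: log p(y) = -N log(pi) - log det K - y^H K^{-1} y
    return -D * N * np.log(np.pi) - D * logdet - quad
```

In a latent variable model of this kind, the latent points `X` and the hyperparameters (`lengthscale`, `variance`, `noise_var`) would be adjusted to increase this quantity, for example by gradient ascent. Note that the complex Gaussian density has no factor of 1/2 in the exponent or the log-determinant, unlike the real-valued case.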
Related papers
- Enhancement Of Noisy Speech With Low Speech Distortion Based On Probabilistic Geometric Spectral Subtraction (2018)
- Phase Aware Speech Enhancement Using Realisation Of Complex-valued LSTM (2020)
- Phase-aware Speech Enhancement With Deep Complex U-net (2019)
- End-to-end Model For Speech Enhancement By Consistent Spectrogram Masking (2019)
- Single-channel Speech Enhancement With Deep Complex U-networks And Probabilistic Latent Space Models (2023)
- Magnitude-and-phase-aware Speech Enhancement With Parallel Sequence Modeling (2023)
- Speech Enhancement Using Separable Polling Attention And Global Layer Normalization Followed With PReLU (2021)
- Explicit Estimation Of Magnitude And Phase Spectra In Parallel For High-quality Speech Enhancement (2023)