An Explicit Consistency-preserving Loss Function For Phase Reconstruction And Speech Enhancement
2024 · Pin-Jui Ku, Chun-Wei Ho, Hao Yen, et al.
Abstract
In this work, we propose a novel consistency-preserving loss function for recovering the phase information in the context of phase reconstruction (PR) and speech enhancement (SE). Different from conventional techniques that directly estimate the phase using a deep model, our idea is to exploit ad-hoc constraints to directly generate a consistent pair of magnitude and phase. Specifically, the proposed loss forces a set of complex numbers to be a consistent short-time Fourier transform (STFT) representation, i.e., to be the spectrogram of a real signal. Our approach thus avoids the difficulty of estimating the original phase, which is highly unstructured and sensitive to time shift. The influence of our proposed loss is first assessed on a PR task, experimentally demonstrating that our approach is viable. Next, we show its effectiveness on an SE task, using both the VB-DMD and WSJ0-CHiME3 data sets. On VB-DMD, our approach is competitive with conventional solutions. On the challenging WS
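The abstract does not give the loss formula, so the following is only a minimal sketch of the general idea of STFT consistency, not the authors' loss. It assumes one common way to measure inconsistency: map a complex array back to a real signal with an inverse STFT, re-analyze it, and compare the result with the input. The helper names (`stft`, `istft`, `consistency_loss`) and the parameters `n_fft=64`, `hop=16` are illustrative choices, not taken from the paper.

```python
import numpy as np

def stft(x, n_fft=64, hop=16):
    """One-sided STFT with a periodic Hann window; returns (frames, bins)."""
    win = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=-1)

def istft(X, n_fft=64, hop=16):
    """Inverse STFT by windowed overlap-add, normalized by the summed squared window."""
    win = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(X, n=n_fft, axis=-1) * win
    length = n_fft + (X.shape[0] - 1) * hop
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * hop : i * hop + n_fft] += frame
        norm[i * hop : i * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-8)  # guard against division by zero at the edges

def consistency_loss(X, n_fft=64, hop=16):
    """Mean squared distance between X and STFT(iSTFT(X)) (illustrative definition)."""
    X_proj = stft(istft(X, n_fft, hop), n_fft, hop)
    return float(np.mean(np.abs(X - X_proj) ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)

# The STFT of a real signal is consistent by construction.
X_consistent = stft(x)
# An arbitrary complex array of the same shape is generally not the STFT of any real signal.
X_random = rng.standard_normal(X_consistent.shape) + 1j * rng.standard_normal(X_consistent.shape)

loss_consistent = consistency_loss(X_consistent)
loss_random = consistency_loss(X_random)
print(loss_consistent, loss_random)
```

Under these assumptions, the loss is numerically zero for the spectrogram of a real signal and clearly positive for an arbitrary complex array. This is the property a consistency-preserving loss can exploit: it penalizes magnitude and phase pairs that do not correspond to any real signal, without requiring the original phase as a target.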
Related papers
- Phase Continuity: Learning Derivatives Of Phase Spectrum For Speech Enhancement (2022)
- A Consolidated View Of Loss Functions For Supervised Deep Learning-based Speech Enhancement (2020)
- Explicit Estimation Of Magnitude And Phase Spectra In Parallel For High-quality Speech Enhancement (2023)
- End-to-end Model For Speech Enhancement By Consistent Spectrogram Masking (2019)
- End-to-end Speech Separation With Unfolded Iterative Phase Reconstruction (2018)
- Phase-aware Speech Enhancement With Deep Complex U-net (2019)
- Magnitude-and-phase-aware Speech Enhancement With Parallel Sequence Modeling (2023)
- Phase Aware Speech Enhancement Using Realisation Of Complex-valued LSTM (2020)