Speech Enhancement Via Two-stage Dual Tree Complex Wavelet Packet Transform With A Speech Presence Probability Estimator
2016 · Pengfei Sun, Jun Qin
Abstract
In this paper, a two-stage dual tree complex wavelet packet transform (DTCWPT) based speech enhancement algorithm is proposed, in which a speech presence probability (SPP) estimator and a generalized minimum mean squared error (MMSE) estimator are developed. To overcome the signal distortion caused by the downsampling in the WPT, a two-stage analytic decomposition that concatenates an undecimated WPT (UWPT) and a decimated WPT is employed. An SPP estimator in the DTCWPT domain is derived based on a generalized Gamma distribution for speech and a Gaussian assumption for noise. The validation results show that the proposed algorithm achieves higher perceptual evaluation of speech quality (PESQ) scores and segmental signal-to-noise ratio (SegSNR) in low-SNR nonstationary noise than four other state-of-the-art speech enhancement algorithms, including optimally modified LSA (OM-LSA), soft masking using a posteriori SNR uncertainty (SMPO), a posteriori SPP based MMSE estimation (MMSE-SP
Related papers
- Learning-based A Posteriori Speech Presence Probability Estimation And Applications (2025)
- Multi-task Single Channel Speech Enhancement Using Speech Presence Probability As A Secondary Task Training Target (2020)
- DNN-based Speech Presence Probability Estimation For Multi-frame Single-microphone Speech Enhancement (2019)
- Speech Enhancement Based On Reducing The Detail Portion Of Speech Spectrograms In Modulation Domain Via Discrete Wavelet Transform (2018)
- Speech Enhancement With Perceptually-motivated Optimization And Dual Transformations (2022)
- Magnitude-phase Dual-path Speech Enhancement Network Based On Self-supervised Embedding And Perceptual Contrast Stretch Boosting (2025)
- Enhancement Of Noisy Speech With Low Speech Distortion Based On Probabilistic Geometric Spectral Subtraction (2018)
- Modeling Of Teager Energy Operated Perceptual Wavelet Packet Coefficients With An Erlang-2 PDF For Real Time Enhancement Of Noisy Speech (2018)