Speech Enhancement With Perceptually-motivated Optimization And Dual Transformations
2022 · Xucheng Wan, Kai Liu, Ziqing Du, et al.
Abstract
To address the monaural speech enhancement problem, numerous studies have enhanced speech by operating either in the time domain, on an inner domain learned from the speech mixture, or in the time-frequency domain, on fixed full-band short-time Fourier transform (STFT) spectrograms. Very recently, a few studies on sub-band based speech enhancement have been proposed. By operating on sub-band spectrograms, those studies demonstrated competitive performance on the DNS2020 benchmark dataset. Although attractive, this new research direction has not been fully explored and there is still room for improvement. In this study, we therefore delve into this latest research direction and propose a sub-band based speech enhancement system with perceptually-motivated optimization and dual transformations, called PT-FSE. Specifically, the proposed PT-FSE model improves its backbone, a full-band and sub-band fusion model, in three ways. First, we design a
Related papers
- DPT-FSNet: Dual-path Transformer Based Full-band And Sub-band Fusion Network For Speech Enhancement (2021) · 0.00
- ForkNet: Simultaneous Time And Time-frequency Domain Modeling For Speech Enhancement (2023) · 0.00
- Time-domain Speech Enhancement Assisted By Multi-resolution Frequency Encoder And Decoder (2023) · 9.76
- Efficient Encoder-decoder And Dual-path Conformer For Comprehensive Feature Learning In Speech Enhancement (2023) · 7.16
- SE Territory: Monaural Speech Enhancement Meets The Fixed Virtual Perceptual Space Mapping (2023) · 0.00
- Dynamic Acoustic Compensation And Adaptive Focal Training For Personalized Speech Enhancement (2022) · 4.52
- FB-MSTCN: A Full-band Single-channel Speech Enhancement Method Based On Multi-scale Temporal Convolutional Network (2022) · 6.77
- LiSenNet: Lightweight Sub-band And Dual-path Modeling For Real-time Speech Enhancement (2024) · 9.03