Mel-fullsubnet: Mel-spectrogram Enhancement For Improving Both Speech Quality And ASR
2024 Β· Rui Zhou, Xian Li, Ying Fang, et al.
Abstract
In this work, we propose Mel-FullSubNet, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance. Mel-FullSubNet takes as input the noisy and reverberant Mel-spectrogram and predicts the corresponding clean Mel-spectrogram. The enhanced Mel-spectrogram can be either transformed to speech waveform with a neural vocoder or directly used for ASR. Mel-FullSubNet encapsulates interleaved full-band and sub-band networks, for learning the full-band spectral pattern of signals and the sub-band/narrow-band properties of signals, respectively. Compared to linear-frequency domain or time-domain speech enhancement, the major advantage of Mel-spectrogram enhancement is that Mel-frequency presents speech in a more compact way and thus is easier to learn, which will benefit both speech quality and ASR. Experimental results demonstrate a significant improvement in both speech quality and ASR performance
Authors
(none)
Tags
Stats
Related papers
- Fast Fullsubnet: Accelerate Full-band And Sub-band Fusion Model For Single-channel Speech Enhancement (2022)5.56
- Fullsubnet+: Channel Attention Fullsubnet With Complex Spectrograms For Speech Enhancement (2022)15.10
- Fullsubnet: A Full-band And Sub-band Fusion Model For Real-time Single-channel Speech Enhancement (2020)17.09
- Dmf-net: A Decoupling-style Multi-band Fusion Model For Full-band Speech Enhancement (2022)7.16
- High-quality Speech Synthesis Using Super-resolution Mel-spectrogram (2019)0.00
- Speech And Noise Dual-stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition (2023)5.24
- Exploring Speech Enhancement With Generative Adversarial Networks For Robust Speech Recognition (2017)16.14
- Dpt-fsnet: Dual-path Transformer Based Full-band And Sub-band Fusion Network For Speech Enhancement (2021)0.00