Injecting Spatial Information For Monaural Speech Enhancement Via Knowledge Distillation
2022 · Xinmeng Xu, Weiping Tu, Yuhong Yang
Abstract
Monaural speech enhancement (SE) provides a versatile and cost-effective approach to SE tasks by using recordings from a single microphone. However, monaural SE lags behind multi-channel SE in performance because monaural methods cannot extract spatial information from one-channel recordings, which greatly limits their application scenarios. To address this issue, we inject spatial information into the monaural SE model and propose a knowledge distillation strategy that enables the monaural SE model to learn binaural speech features from a binaural SE model. This allows the monaural SE model to reconstruct enhanced speech with higher intelligibility and quality under low signal-to-noise ratio (SNR) conditions. Extensive experiments show that the proposed monaural SE model, with spatial information injected via knowledge distillation, performs favorably against other monaural SE models while using fewer parameters.
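The abstract does not spell out the training objective, so the following is only a minimal sketch of how feature-level knowledge distillation is commonly set up, not the authors' implementation. It assumes a frozen binaural teacher and a monaural student that each expose an intermediate feature vector, and it combines an enhancement error with a feature-matching error. The function names, the mean-squared-error terms, and the weight `alpha` are all hypothetical choices made for illustration.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(
    student_out: Sequence[float],
    clean_target: Sequence[float],
    student_feat: Sequence[float],
    teacher_feat: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Hypothetical combined objective for the monaural student.

    (1 - alpha) * enhancement error against the clean speech target
    + alpha * feature-matching error against the binaural teacher.
    The weighting scheme is an assumption, not taken from the paper.
    """
    enhancement = mse(student_out, clean_target)
    feature_match = mse(student_feat, teacher_feat)
    return (1.0 - alpha) * enhancement + alpha * feature_match


# Toy values only; these are not data from the paper.
loss = distillation_loss(
    student_out=[0.0, 1.0],
    clean_target=[0.0, 0.0],
    student_feat=[1.0, 1.0],
    teacher_feat=[1.0, 3.0],
    alpha=0.5,
)
```

In a setup like this, only the student would be updated during training, and at inference time the student would run on single-channel input alone, which is what makes the monaural deployment possible.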
Related papers
- SE Territory: Monaural Speech Enhancement Meets The Fixed Virtual Perceptual Space Mapping (2023) · 0.00
- End-to-end Multi-channel Speaker Extraction And Binaural Speech Synthesis (2024) · 0.00
- Real-time Stereo Speech Enhancement With Spatial-cue Preservation Based On Dual-path Structure (2024) · 5.84
- Bridging The Gap Between Monaural Speech Enhancement And Recognition With Distortion-independent Acoustic Modeling (2019) · 7.50
- Efficient Multi-channel Speech Enhancement With Spherical Harmonics Injection For Directional Encoding (2023) · 3.58
- Exploring The Potential Of Data-driven Spatial Audio Enhancement Using A Single-channel Model (2024) · 0.00
- Integrated Multi-level Knowledge Distillation For Enhanced Speaker Verification (2024) · 0.00
- Mutual Learning Of Single- And Multi-channel End-to-end Neural Diarization (2022) · 3.58