Investigation Of Monaural Front-end Processing For Robust ASR Without Retraining Or Joint-training
2018 · Zhihao Du, Xueliang Zhang, Jiqing Han
Abstract
In recent years, monaural speech separation has been formulated as a supervised learning problem. This formulation has been studied systematically and has yielded dramatic improvements in speech intelligibility and quality for human listeners. However, it has not been well investigated whether these methods can serve as front-end processing that directly improves the performance of a machine listener, i.e., an automatic speech recognizer, without retraining or jointly training the acoustic model. In this paper, we explore the effectiveness of independent front-end processing for multi-condition trained ASR on the CHiME-3 challenge. We find that directly feeding the enhanced features to ASR yields relative WER reductions of 36.40% and 11.78% for the GMM-based and DNN-based ASR, respectively. We also investigate the effect of the noisy phase and the generalization ability under unmatched noise conditions.
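For readers unfamiliar with the metric, a relative WER reduction compares the word error rate with and without the front-end, normalized by the baseline. The sketch below shows the arithmetic only. The function name and the WER values are hypothetical, chosen for illustration, and are not figures from the paper.

```python
def relative_wer_reduction(baseline_wer: float, enhanced_wer: float) -> float:
    """Return the relative WER reduction in percent.

    baseline_wer: WER (%) when noisy features are fed to the recognizer.
    enhanced_wer: WER (%) when enhanced features are fed to the same recognizer.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - enhanced_wer) / baseline_wer


# Hypothetical WERs for illustration only (not results reported in the paper).
example = relative_wer_reduction(baseline_wer=20.0, enhanced_wer=15.0)
print(f"{example:.2f}% relative WER reduction")  # prints "25.00% relative WER reduction"
```

A relative reduction is distinct from an absolute one: in the hypothetical case above, the absolute drop is 5 percentage points while the relative reduction is 25%.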
Related papers
- Towards Decoupling Frontend Enhancement And Backend Recognition In Monaural Robust ASR (2024)
- Speaker Reinforcement Using Target Source Extraction For Robust Automatic Speech Recognition (2022)
- Investigation Of Practical Aspects Of Single Channel Speech Separation For ASR (2021)
- On Monoaural Speech Enhancement For Automatic Recognition Of Real Noisy Speech Using Mixture Invariant Training (2022)
- Improving Noise Robust Automatic Speech Recognition With Single-channel Time-domain Enhancement Network (2020)
- Time-domain Speech Enhancement For Robust Automatic Speech Recognition (2022)
- End-to-end Monaural Multi-speaker ASR System Without Pretraining (2018)
- A Conformer-based ASR Frontend For Joint Acoustic Echo Cancellation, Speech Enhancement And Speech Separation (2021)