How Does End-to-end Speech Recognition Training Impact Speech Enhancement Artifacts?
2023 · Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, et al.
Abstract
Jointly training a speech enhancement (SE) front-end and an automatic speech recognition (ASR) back-end has been investigated as a way to mitigate the influence of *processing distortion* generated by single-channel SE on ASR. In this paper, we investigate the effect of such joint training on the signal-level characteristics of the enhanced signals from the viewpoint of the decomposed noise and artifact errors. The experimental analyses provide two novel findings: 1) ASR-level training of the SE front-end reduces the artifact errors while increasing the noise errors, and 2) simply interpolating the enhanced and observed signals, which achieves a similar effect of reducing artifacts and increasing noise, improves ASR performance without jointly modifying the SE and ASR modules, even for a strong ASR back-end using a WavLM feature extractor. Our findings provide a better understanding of the effect of joint training and a novel insight for designing an ASR-agnostic SE front-end.
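The interpolation mentioned in the second finding can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the convex combination below, the function name `interpolate_signals`, and the `weight` parameter are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def interpolate_signals(
    enhanced: Sequence[float],
    observed: Sequence[float],
    weight: float,
) -> List[float]:
    """Blend the SE output with the unprocessed observation, sample by sample.

    Assumed form (not taken from the paper):
        output = (1 - weight) * enhanced + weight * observed
    weight = 0 returns the enhanced signal unchanged; weight = 1 returns
    the observed (noisy) signal unchanged.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(enhanced) != len(observed):
        raise ValueError("signals must have the same number of samples")
    return [(1.0 - weight) * e + weight * o for e, o in zip(enhanced, observed)]


# Hypothetical toy waveforms, two samples each.
mixed = interpolate_signals([1.0, 2.0], [3.0, 4.0], weight=0.5)
```

Under this assumed form, a larger `weight` passes more of the observed signal through to the ASR back-end. That brings back some of the noise the SE front-end removed while scaling down what the front-end changed, which is the trade-off the abstract describes.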
Related papers
- Rethinking Processing Distortions: Disentangling The Impact Of Speech Enhancement Errors On Speech Recognition Performance (2024)
- Joint Training Of Speech Enhancement And Self-supervised Model For Noise-robust ASR (2022)
- SNRi Target Training For Joint Speech Enhancement And Recognition (2021)
- How Bad Are Artifacts?: Analyzing The Impact Of Speech Enhancement Errors On ASR (2022)
- Bridging The Gap: Integrating Pre-trained Speech Enhancement And Recognition Models For Robust Speech Recognition (2024)
- Human Listening And Live Captioning: Multi-task Training For Speech Enhancement (2021)
- Towards Decoupling Frontend Enhancement And Backend Recognition In Monaural Robust ASR (2024)
- Speech And Noise Dual-stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition (2023)