Loss Prediction: End-to-end Active Learning Approach For Speech Recognition
2021 Β· Jian Luo, Jianzong Wang, Ning Cheng, et al.
Abstract
End-to-end speech recognition systems usually require huge amounts of labeling resource, while annotating the speech data is complicated and expensive. Active learning is the solution by selecting the most valuable samples for annotation. In this paper, we proposed to use a predicted loss that estimates the uncertainty of the sample. The CTC (Connectionist Temporal Classification) and attention loss are informative for speech recognition since they are computed based on all decoding paths and alignments. We defined an end-to-end active learning pipeline, training an ASR/LP (Automatic Speech Recognition/Loss Prediction) joint model. The proposed approach was validated on an English and a Chinese speech recognition task. The experiments show that our approach achieves competitive results, outperforming random selection, least confidence, and estimated loss method.
Authors
(none)
Tags
Stats
Related papers
- Boosting Active Learning For Speech Recognition With Noisy Pseudo-labeled Samples (2020)0.00
- Multiple-hypothesis Ctc-based Semi-supervised Adaptation Of End-to-end Speech Recognition (2021)5.84
- End-to-end Automatic Speech Recognition Integrated With Ctc-based Voice Activity Detection (2020)11.76
- Joint Ctc-attention Based End-to-end Speech Recognition Using Multi-task Learning (2016)20.43
- Advances In Joint Ctc-attention Based End-to-end Speech Recognition With A Deep CNN Encoder And RNN-LM (2017)16.49
- Unsupervised Active Learning: Optimizing Labeling Cost-effectiveness For Automatic Speech Recognition (2023)0.00
- Continual Learning For Monolingual End-to-end Automatic Speech Recognition (2021)7.16
- A CTC Alignment-based Non-autoregressive Transformer For End-to-end Automatic Speech Recognition (2023)10.97