Advancing Connectionist Temporal Classification With Attention Modeling
2018 · Amit Das, Jinyu Li, Rui Zhao, et al.
Abstract
In this study, we propose advancing all-neural speech recognition by directly incorporating attention modeling within the Connectionist Temporal Classification (CTC) framework. In particular, we derive new context vectors using time convolution features to model attention as part of the CTC network. To further improve attention modeling, we utilize content information extracted from a network representing an implicit language model. Finally, we introduce vector-based attention weights that are applied on context vectors across both time and their individual components. We evaluate our system on a 3400-hour Microsoft Cortana voice assistant task and demonstrate that our proposed model consistently outperforms the baseline model, achieving about 20% relative reduction in word error rates.
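The abstract does not spell out the computation, so the following is only a rough sketch of what context vectors built from time convolution features with vector-based attention weights might look like. Everything in it is an assumption made for illustration: the function names (`context_vectors`, `matvec`, `softmax`), the `kernels` and `scores` parameters, the one-projection-matrix-per-time-offset layout, the truncated window at utterance edges, and the per-component softmax over time. It is not the authors' implementation, and it omits the implicit language model entirely.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def context_vectors(hidden, kernels, scores=None):
    """Hypothetical sketch of attention-weighted context vectors.

    hidden:  list of T encoder outputs, each a list of n floats.
    kernels: list of 2*tau + 1 matrices (d x n), one per time offset
             (assumed layout for the time-convolution features).
    scores:  optional callable (t, s, feature) -> list of d raw scores.
             None means uniform attention over the window.
    Returns T context vectors, each a list of d floats.
    """
    T = len(hidden)
    tau = len(kernels) // 2
    contexts = []
    for t in range(T):
        # Assumption: the window is simply truncated at utterance edges.
        window = [s for s in range(t - tau, t + tau + 1) if 0 <= s < T]
        # Time-convolution features: offset-specific projection per frame.
        feats = [matvec(kernels[s - t + tau], hidden[s]) for s in window]
        d = len(feats[0])
        if scores is None:
            weights = [[1.0 / len(window)] * d for _ in window]
        else:
            raw = [scores(t, s, g) for s, g in zip(window, feats)]
            # Vector-based weights: a separate softmax over time for each
            # component, so every dimension has its own attention profile.
            per_dim = [softmax([r[k] for r in raw]) for k in range(d)]
            weights = [[per_dim[k][i] for k in range(d)]
                       for i in range(len(window))]
        contexts.append(
            [sum(w[k] * g[k] for w, g in zip(weights, feats))
             for k in range(d)]
        )
    return contexts
```

In this sketch, passing `scores=None` averages the projected frames in each window, while a `scores` callable that returns one value per component lets each dimension of the context vector attend to different frames. The resulting context vectors would then stand in for the raw encoder outputs that feed the CTC output layer.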
Related papers
- Self-attention Networks For Connectionist Temporal Classification In Speech Recognition (2019)
- An Improved Hybrid CTC-attention Model For Speech Recognition (2018)
- Towards Personalization Of CTC Speech Recognition Models With Contextual Adapters And Adaptive Boosting (2022)
- Advances In All-neural Speech Recognition (2016)
- A Neural Attention Model For Speech Command Recognition (2018)
- Attention-based Contextual Language Model Adaptation For Speech Recognition (2021)
- Advances In Joint CTC-attention Based End-to-end Speech Recognition With A Deep CNN Encoder And RNN-LM (2017)
- End-to-end Contextual ASR Based On Posterior Distribution Adaptation For Hybrid CTC/attention System (2022)