Hierarchical Multitask Learning For CTC-based Speech Recognition
2018 · Kalpesh Krishna, Shubham Toshniwal, Karen Livescu
Abstract
Previous work has shown that neural encoder-decoder speech recognition can be improved with hierarchical multitask learning, where auxiliary tasks are added at intermediate layers of a deep encoder. We explore the effect of hierarchical multitask learning in the context of connectionist temporal classification (CTC)-based speech recognition, and investigate several aspects of this approach. Consistent with previous work, we observe performance improvements on telephone conversational speech recognition (specifically the Eval2000 test sets) when training a subword-level CTC model with an auxiliary phone loss at an intermediate layer. We analyze the effects of a number of experimental variables (like interpolation constant and position of the auxiliary loss function), performance in lower-resource settings, and the relationship between pretraining and multitask learning. We observe that the hierarchical multitask approach improves over standard multitask training in our higher-data exper
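The training objective described in the abstract can be sketched in a few lines: a subword-level CTC loss on the final encoder layer, interpolated with a phone-level CTC loss on an intermediate layer. The sketch below is illustrative only and is not the authors' implementation. The function names, the `lam` parameter, and the convex form `(1 - lam) * main + lam * aux` of the interpolation are assumptions; the per-frame log-probabilities are taken as given inputs rather than produced by a real encoder.

```python
import math

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) over the arguments."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_nll(log_probs, target, blank=0):
    """CTC negative log-likelihood via the forward algorithm.

    log_probs: list of T frames, each a list of per-label log-probabilities.
    target:    list of label indices (no blanks).
    """
    # Extended label sequence with blanks interleaved: [b, y1, b, y2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    for t in range(1, len(log_probs)):
        new = [NEG_INF] * S
        for s in range(S):
            terms = [alpha[s]]
            if s >= 1:
                terms.append(alpha[s - 1])
            # A blank may be skipped only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new[s] = logsumexp(*terms) + log_probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or the trailing blank.
    total = logsumexp(alpha[-1], alpha[-2]) if S > 1 else alpha[-1]
    return -total


def hierarchical_multitask_loss(final_log_probs, subword_target,
                                mid_log_probs, phone_target, lam=0.5):
    """Interpolate the main (subword) and auxiliary (phone) CTC losses.

    final_log_probs: frame posteriors from the top encoder layer.
    mid_log_probs:   frame posteriors from an intermediate encoder layer.
    lam:             hypothetical interpolation constant (assumed convex form).
    """
    main = ctc_nll(final_log_probs, subword_target)
    aux = ctc_nll(mid_log_probs, phone_target)
    return (1.0 - lam) * main + lam * aux


def uniform_frames(num_frames, vocab_size):
    """Toy posteriors: every label equally likely at every frame."""
    return [[math.log(1.0 / vocab_size)] * vocab_size for _ in range(num_frames)]


if __name__ == "__main__":
    # Toy setup: 4 subword labels (incl. blank) on top, 6 phone labels (incl. blank) below.
    top = uniform_frames(5, 4)
    mid = uniform_frames(5, 6)
    print(hierarchical_multitask_loss(top, [1, 2], mid, [3, 4, 5], lam=0.3))
```

Setting `lam=0` recovers single-task subword CTC training, which makes the interpolation constant a convenient knob for the kind of sweep the abstract mentions. In a real system both sets of posteriors would come from softmax heads attached to different depths of the same encoder, so the auxiliary gradient only reaches the lower layers.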
Related papers
- Multitask Learning With CTC And Segmental CRF For Speech Recognition (2017)
- Multilingual Training And Cross-lingual Adaptation On CTC-based Acoustic Model (2017)
- Hierarchical Conditional End-to-end ASR With CTC And Multi-granular Subword Units (2021)
- Joint CTC-attention Based End-to-end Speech Recognition Using Multi-task Learning (2016)
- An Improved Hybrid CTC-attention Model For Speech Recognition (2018)
- Multiple-hypothesis CTC-based Semi-supervised Adaptation Of End-to-end Speech Recognition (2021)
- Residual Convolutional CTC Networks For Automatic Speech Recognition (2017)
- Multi-encoder Multi-resolution Framework For End-to-end Speech Recognition (2018)