Integrating Source-channel And Attention-based Sequence-to-sequence Models For Speech Recognition
2019 · Qiujia Li, Chao Zhang, Philip C. Woodland
Abstract
This paper proposes a novel automatic speech recognition (ASR) framework called Integrated Source-Channel and Attention (ISCA) that combines the advantages of traditional systems based on the noisy source-channel model (SC) and end-to-end style systems using attention-based sequence-to-sequence models. The traditional SC system framework includes hidden Markov models and connectionist temporal classification (CTC) based acoustic models, language models (LMs), and a decoding procedure based on a lexicon, whereas the end-to-end style attention-based system jointly models the whole process with a single model. By rescoring the hypotheses produced by traditional systems using end-to-end style systems based on an extended noisy source-channel model, ISCA allows structured knowledge to be easily incorporated via the SC-based model while exploiting the complementarity of the attention-based model. Experiments on the AMI meeting corpus show that ISCA is able to give a relative word error rate
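The rescoring idea in the abstract can be illustrated with a minimal sketch. The excerpt does not give the extended noisy source-channel formula, so the code below uses a generic log-linear interpolation of two scores per hypothesis and should not be read as the authors' exact method. The names `Hypothesis`, `rescore_nbest`, and `att_weight` are hypothetical, and both log scores are assumed to be already computed by the respective systems.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One entry of an N-best list produced by the SC system (hypothetical type)."""
    words: List[str]
    sc_log_score: float   # assumed: combined acoustic + LM log score from the SC system
    att_log_score: float  # assumed: log-probability of the words under the attention-based model


def rescore_nbest(nbest: List[Hypothesis], att_weight: float = 1.0) -> List[Hypothesis]:
    """Return the hypotheses sorted best-first by an interpolated log score.

    This is a generic log-linear combination, an assumption made for
    illustration, not the formula from the paper.
    """
    def combined(h: Hypothesis) -> float:
        return h.sc_log_score + att_weight * h.att_log_score

    return sorted(nbest, key=combined, reverse=True)


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    nbest = [
        Hypothesis(["recognise", "speech"], sc_log_score=-10.0, att_log_score=-5.0),
        Hypothesis(["wreck", "a", "nice", "beach"], sc_log_score=-10.5, att_log_score=-2.0),
    ]
    best = rescore_nbest(nbest, att_weight=1.0)[0]
    print(" ".join(best.words))
```

With `att_weight=0.0` the ranking falls back to the SC system alone, so the weight controls how much the attention-based model is allowed to reorder the SC hypotheses. In practice such a weight would be tuned on held-out data.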
Related papers
- Audio-attention Discriminative Language Model For ASR Rescoring (2019) 9.23
- State-of-the-art Speech Recognition With Sequence-to-sequence Models (2017) 21.01
- End-to-end Multichannel Speaker-attributed ASR: Speaker Guided Decoder And Input Feature Analysis (2023) 0.00
- Multi-stream End-to-end Speech Recognition (2019) 8.35
- End-to-end Multimodal Speech Recognition (2018) 10.21
- An Online Attention-based Model For Speech Recognition (2018) 9.59
- Combining Frame-synchronous And Label-synchronous Systems For Speech Recognition (2021) 0.00
- Stream Attention-based Multi-array End-to-end Speech Recognition (2018) 0.00