On Lattice-free Boosted MMI Training Of HMM And CTC-based Full-context ASR Models
2021 · Xiaohui Zhang, Vimal Manohar, David Zhang, et al.
Abstract
Hybrid automatic speech recognition (ASR) models are typically sequentially trained with CTC or LF-MMI criteria. However, they have vastly different legacies and are usually implemented in different frameworks. In this paper, by decoupling the concepts of modeling units and label topologies and building proper numerator/denominator graphs accordingly, we establish a generalized framework for hybrid acoustic modeling (AM). In this framework, we show that LF-MMI is a powerful training criterion applicable to both limited-context and full-context models, for wordpiece/mono-char/bi-char/chenone units, with both HMM/CTC topologies. From this framework, we propose three novel training schemes: chenone(ch)/wordpiece(wp)-CTC-bMMI, and wordpiece(wp)-HMM-bMMI with different advantages in training performance, decoding efficiency and decoding time-stamp accuracy. The advantages of different training schemes are evaluated comprehensively on Librispeech, and wp-CTC-bMMI and ch-CTC-bMMI are evaluate
Related papers
- Consistent Training And Decoding For End-to-end Speech Recognition Using Lattice-free MMI (2021)
- Simplified End-to-end MMI Training And Voting For ASR (2017)
- HMM Vs. CTC For Automatic Speech Recognition: Comparison Based On Full-sum Training From Scratch (2022)
- Multilingual Training And Cross-lingual Adaptation On CTC-based Acoustic Model (2017)
- Unsupervised Model-based Speaker Adaptation Of End-to-end Lattice-free MMI Model For Speech Recognition (2022)
- A Comparison Of Lattice-free Discriminative Training Criteria For Purely Sequence-trained Neural Network Acoustic Models (2018)
- Full-sum Decoding For Hybrid HMM Based Speech Recognition Using LSTM Language Model (2020)
- Boundary And Context Aware Training For CIF-based Non-autoregressive End-to-end ASR (2021)