Hybrid CTC-Attention Based End-to-End Speech Recognition Using Subword Units
2018 · Zhangyu Xiao, Zhijian Ou, Wei Chu, et al.
Abstract
In this paper, we present an end-to-end automatic speech recognition system that successfully employs subword units in a hybrid CTC-Attention framework. The subword units are obtained with the byte-pair encoding (BPE) compression algorithm. Unlike words, characters and subword units do not suffer from the out-of-vocabulary (OOV) problem when used as modeling units. In addition, subword units can model longer context than characters. We evaluate different systems on the LibriSpeech 1000h dataset. The subword-based hybrid CTC-Attention system obtains a 6.8% word error rate (WER) on the test_clean subset without any dictionary or external language model. This represents a significant improvement (a 12.8% relative WER reduction) over the character-based hybrid CTC-Attention system.
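As a rough illustration of how BPE derives subword units, the sketch below learns merge rules from a toy word list and then segments unseen words with them. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names (`learn_bpe`, `merge_pair`, `segment`) are hypothetical, no end-of-word marker is used, and the number of merges is arbitrary rather than taken from the paper.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words (toy sketch)."""
    # Start from characters: each word is a tuple of symbols with a frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Merge the most frequent pair everywhere in the vocabulary.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

Because segmentation always falls back to single characters, any word can be represented, which is why subword units avoid the OOV problem; frequent character sequences collapse into longer units, so each output token covers more context than a single character.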
Related papers
- An Improved Hybrid CTC-Attention Model For Speech Recognition (2018)
- An Investigation Of Phone-based Subword Units For End-to-end Speech Recognition (2020)
- Subword And Crossword Units For CTC Acoustic Models (2017)
- Acoustic Data-driven Subword Modeling For End-to-end Speech Recognition (2021)
- A Systematic Comparison Of Grapheme-based Vs. Phoneme-based Label Units For Encoder-decoder-attention Models (2020)
- Advancing CTC-CRF Based End-to-end Speech Recognition With Wordpieces And Conformers (2021)
- Faster, Simpler And More Accurate Hybrid ASR Systems Using Wordpieces (2020)
- Towards End-to-end Code-switching Speech Recognition (2018)