Attention-based Sequence-to-sequence Model For Speech Recognition: Development Of State-of-the-art System On Librispeech And Its Application To Non-native English
2018 · Yan Yin, Ramon Prieto, Bin Wang, et al.
Abstract
Recent research has shown that attention-based sequence-to-sequence models such as Listen, Attend, and Spell (LAS) yield results comparable to those of state-of-the-art ASR systems on various tasks. In this paper, we describe the development of such a system and demonstrate its performance on two tasks. First, we achieve a new state-of-the-art word error rate of 3.43% on the test-clean subset of the LibriSpeech English data. Second, on non-native English speech, including both read and spontaneous speech, we obtain very competitive results compared to a conventional system built with the most up-to-date Kaldi recipe.
Related papers
- State-of-the-art Speech Recognition With Sequence-to-sequence Models (2017) 21.01
- An Online Attention-based Model For Speech Recognition (2018) 9.59
- RWTH ASR Systems For Librispeech: Hybrid Vs Attention -- W/o Data Augmentation (2019) 15.34
- Listen Attentively, And Spell Once: Whole Sentence Generation Via A Non-autoregressive Architecture For Low-latency Speech Recognition (2020) 10.07
- English Accent Accuracy Analysis In A State-of-the-art Automatic Speech Recognition System (2021) 0.00
- Attention Based End To End Speech Recognition For Voice Search In Hindi And English (2021) 6.77
- Audio-attention Discriminative Language Model For ASR Rescoring (2019) 9.23
- An Improved Hybrid CTC-attention Model For Speech Recognition (2018) 0.00