Fast Offline Transformer-based End-to-end Automatic Speech Recognition For Real-world Applications
2021 · Yoo Rhee Oh, Kiyoung Park, Jeon Gyu Park
Abstract
With recent advances in technology, automatic speech recognition (ASR) is now widely used in real-world applications, and converting large amounts of speech into text accurately with limited resources has become more important than ever. This paper proposes a method to rapidly recognize a large speech database with a Transformer-based end-to-end model. Transformers have improved the state of the art in many fields, but they are not easy to apply to long sequences. Various techniques to speed up the recognition of real-world speech are proposed and tested: decoding via multiple-utterance batched beam search, detecting end-of-speech based on connectionist temporal classification (CTC), restricting the CTC prefix score, and splitting long speech into short segments. Experiments are conducted on the LibriSpeech English and real-world Korean ASR tasks to verify the proposed methods. From the experiments, the proposed sy
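The abstract lists splitting long speech into short segments as one of the speed-ups, but the text available here does not say how the cut points are chosen. The following is only a minimal sketch under stated assumptions, not the authors' method: it assumes a per-frame energy value is already available and cuts at the quietest frame near a length limit. The function name `split_long_utterance` and the parameters `max_frames` and `search_window` are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_long_utterance(
    frame_energy: Sequence[float],
    max_frames: int,
    search_window: int,
) -> List[Tuple[int, int]]:
    """Split frames into [start, end) segments of at most max_frames frames.

    Assumption (not from the paper): each cut is placed right after the
    lowest-energy frame among the last `search_window` frames that would
    still fit in the current segment.
    """
    if not 0 < search_window <= max_frames:
        raise ValueError("need 0 < search_window <= max_frames")

    segments: List[Tuple[int, int]] = []
    start = 0
    total = len(frame_energy)

    # Keep cutting while the remaining audio is longer than one segment.
    while total - start > max_frames:
        hard_limit = start + max_frames      # segment may not extend past this
        lo = hard_limit - search_window      # first candidate frame for the cut
        quietest = min(range(lo, hard_limit), key=frame_energy.__getitem__)
        end = quietest + 1                   # cut right after the quietest frame
        segments.append((start, end))
        start = end

    # Whatever is left fits in a single final segment.
    if start < total:
        segments.append((start, total))
    return segments
```

Each resulting segment could then be decoded independently or grouped into a batch. A real system would more likely pick cut points from a voice activity detector or from model outputs than from raw energy.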
Related papers
- Transformer-based Online CTC/attention End-to-end Speech Recognition Architecture (2020) · 14.06
- Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers (2021) · 10.07
- Conv-transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-end Speech Recognition (2020) · 11.08
- Paraformer: Fast And Accurate Parallel Transformer For Non-autoregressive End-to-end Speech Recognition (2022) · 15.10
- Developing Real-time Streaming Transformer Transducer For Speech Recognition On Large-scale Dataset (2020) · 0.00
- Improving Non-autoregressive End-to-end Speech Recognition With Pre-trained Acoustic And Language Models (2022) · 10.07
- Transformer-based Online Speech Recognition With Decoder-end Adaptive Computation Steps (2020) · 7.81
- A CTC Alignment-based Non-autoregressive Transformer For End-to-end Automatic Speech Recognition (2023) · 10.97