Efficient Conformer With Prob-sparse Attention Mechanism For End-to-end Speech Recognition
2021 · Xiong Wang, Sining Sun, Lei Xie, et al.
Abstract
End-to-end models are favored in automatic speech recognition (ASR) because of their simplified system structure and superior performance. Among these models, Transformer and Conformer have achieved state-of-the-art recognition accuracy, with self-attention playing a vital role in capturing important global information. However, the time and memory complexity of self-attention grows quadratically with the length of the sentence. In this paper, a prob-sparse self-attention mechanism is introduced into Conformer to sparsify the self-attention computation, in order to accelerate inference and reduce memory consumption. Specifically, we adopt a Kullback-Leibler divergence based sparsity measurement for each query to decide whether to compute the attention function for that query. By using the prob-sparse attention mechanism, we achieve an 8% to 45% inference speed-up and a 15% to 45% memory usage reduction in the self-attention module of Conformer Transducer while maintai
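The query-selection idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the formula, so this assumes the sparsity measurement used in Informer-style prob-sparse attention: for each query, the log-sum-exp of its scaled scores minus their mean, which is large when the query's attention distribution is far from uniform. The function names, the parameter `u` (number of queries kept), and the fallback of giving unselected queries the mean of `V` are illustrative assumptions, not details confirmed by the paper. For clarity the sketch computes the full score matrix, so it shows the selection logic rather than the speed-up.

```python
import numpy as np

def sparsity_scores(Q, K):
    """Per-query sparsity measurement (assumed Informer-style form):
    M(q_i, K) = logsumexp_j(s_ij) - mean_j(s_ij), with s = Q K^T / sqrt(d)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # shape (L_q, L_k)
    row_max = scores.max(axis=1, keepdims=True)        # for numerical stability
    lse = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    return lse - scores.mean(axis=1), scores

def prob_sparse_attention(Q, K, V, u):
    """Compute softmax attention only for the u queries with the largest
    sparsity measurement; the remaining queries fall back to mean(V)
    (an assumption borrowed from Informer, not taken from this paper)."""
    M, scores = sparsity_scores(Q, K)
    top = np.argsort(-M)[:u]                           # indices of "active" queries
    out = np.tile(V.mean(axis=0), (Q.shape[0], 1))     # default output for lazy queries
    s = scores[top]
    w = np.exp(s - s.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                  # row-wise softmax
    out[top] = w @ V
    return out, top

rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 3))
out, top = prob_sparse_attention(Q, K, V, u=2)
print(out.shape, sorted(top.tolist()))
```

A query whose scores are all equal attends uniformly and gets the minimum possible measurement, ln(L_k), so it is the first to be skipped. Setting `u` to the number of queries recovers ordinary softmax attention.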
Related papers
- Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition (2023) 14.47
- Efficient Conformer: Progressive Downsampling And Grouped Attention For Automatic Speech Recognition (2021) 13.79
- Attention-based ASR With Lightweight And Dynamic Convolutions (2019) 9.03
- Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition (2023) 3.58
- Nextformer: A ConvNeXt Augmented Conformer For End-to-end Speech Recognition (2022) 0.00
- Generalizing RNN-transducer To Out-domain Audio Via Sparse Self-attention Layers (2021) 6.34
- Towards A Unified Conformer Structure: From ASR To ASV Task (2022) 13.11
- Improving Hybrid CTC/attention End-to-end Speech Recognition With Pretrained Acoustic And Language Model (2021) 8.82