Vectorization Of Hypotheses And Speech For Faster Beam Search In Encoder Decoder-based Speech Recognition
2018 · Hiroshi Seki, Takaaki Hori, Shinji Watanabe
Abstract
Attention-based encoder-decoder networks use a left-to-right beam search algorithm at inference time. The standard beam search expands hypotheses and then traverses the expanded hypotheses at the next time step. This traversal is generally implemented as a for-loop, which slows down recognition. In this paper, we propose a parallelization technique for beam search that accelerates the search by vectorizing multiple hypotheses, eliminating the for-loop. We also propose a technique to batch multiple speech utterances for offline recognition, which reduces the for-loop over utterances. Unlike in training, this extension is not trivial during beam search because of the pruning and thresholding techniques used for efficient decoding. In addition, our method can combine the scores of external modules, RNNLM and CTC, in a batch as shallow fusion. We achieved a 3.7x speedup compared with the original beam
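The core idea of vectorizing hypotheses can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, array shapes, and use of NumPy are assumptions made for illustration, and the sketch omits pruning, end-of-sequence handling, utterance batching, and RNNLM/CTC score fusion. It contrasts one beam search step written as a for-loop with the same step computed in a single array operation.

```python
import numpy as np


def beam_step_loop(hyp_scores, log_probs, beam):
    """Reference step: expand every hypothesis in a Python for-loop."""
    candidates = []
    for h in range(len(hyp_scores)):
        for v in range(log_probs.shape[1]):
            candidates.append((hyp_scores[h] + log_probs[h, v], h, v))
    # Keep the `beam` highest-scoring (score, parent, token) candidates.
    candidates.sort(key=lambda c: -c[0])
    return candidates[:beam]


def beam_step_vectorized(hyp_scores, log_probs, beam):
    """Same step, scoring all hypotheses at once (hypothetical helper)."""
    # Broadcasting: (n_hyps, 1) + (n_hyps, vocab) -> (n_hyps, vocab).
    total = hyp_scores[:, None] + log_probs
    flat = total.ravel()
    # Flat indices of the `beam` best candidates, best first.
    top = np.argsort(-flat, kind="stable")[:beam]
    # Recover the parent hypothesis and the token from each flat index.
    parent, token = np.divmod(top, log_probs.shape[1])
    return flat[top], parent, token


if __name__ == "__main__":
    hyp_scores = np.array([-0.1, -0.5])          # accumulated log-probs
    log_probs = np.log(np.array([[0.7, 0.2, 0.1],
                                 [0.1, 0.1, 0.8]]))
    scores, parent, token = beam_step_vectorized(hyp_scores, log_probs, beam=2)
    print(parent, token)
```

Under the same assumptions, batching several utterances would add a leading batch axis to `hyp_scores` and `log_probs` and apply the top-k selection per utterance; the pruning and thresholding that the abstract describes as non-trivial are not covered here.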
Related papers
- Segment-level Vectorized Beam Search Based On Partially Autoregressive Inference (2023) · 0.00
- Streaming Parallel Transducer Beam Search With Fast-slow Cascaded Encoders (2022) · 0.00
- Robust Beam Search For Encoder-decoder Attention Based Speech Recognition Without Length Bias (2020) · 4.52
- Integration Of Frame- And Label-synchronous Beam Search For Streaming Encoder-decoder Speech Recognition (2023) · 0.00
- Navigating The Minefield Of MT Beam Search In Cascaded Streaming Speech Translation (2024) · 3.58
- A Fully Differentiable Beam Search Decoder (2019) · 0.00
- Joint Beam Search Integrating CTC, Attention, And Transducer Decoders (2024) · 5.24
- Label-looping: Highly Efficient Decoding For Transducers (2024) · 4.52