Run-and-back Stitch Search: Novel Block Synchronous Decoding For Streaming Encoder-decoder ASR
2022 · Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, et al.
Abstract
Streaming-style inference for encoder-decoder automatic speech recognition (ASR) systems is important for reducing latency, which is essential for interactive use cases. To this end, we propose a novel blockwise synchronous decoding algorithm with a hybrid approach that combines endpoint prediction and endpoint post-determination. In endpoint prediction, we use the CTC posterior to compute the expected number of tokens that are yet to be emitted in the encoder features of the current blocks. Based on this expected value, the decoder predicts the endpoint to realize continuous block synchronization, like a running stitch. Meanwhile, endpoint post-determination probabilistically detects backward jumps of the source-target attention, which are caused by misprediction of endpoints. It then discards the affected hypotheses and resumes decoding, like a back stitch. We combine these methods into a hybrid approach, namely run-and-back stitch search, which reduces the computational cost an
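The endpoint-prediction idea in the abstract can be illustrated with a minimal sketch. It is not the paper's formula: it assumes per-frame CTC posteriors that are independent given the audio, counts a token whenever a non-blank label differs from the label of the previous frame, and uses linearity of expectation. The function names, the `num_emitted` argument, and the `threshold` value are all hypothetical.

```python
def expected_token_count(posteriors, blank=0):
    """Expected number of tokens in the CTC-collapsed output.

    posteriors: list of frames, each a list of per-label probabilities.
    Assumption: frames are independent given the audio, so a new token
    starts at frame t with probability sum_k p_t(k) * (1 - p_{t-1}(k))
    over the non-blank labels k.
    """
    expected = 0.0
    prev = None
    for frame in posteriors:
        for k, p in enumerate(frame):
            if k == blank:
                continue
            p_prev = prev[k] if prev is not None else 0.0
            expected += p * (1.0 - p_prev)
        prev = frame
    return expected


def expected_remaining_tokens(posteriors, num_emitted, blank=0):
    """Expected tokens in the current blocks that the decoder has not emitted yet."""
    return max(0.0, expected_token_count(posteriors, blank) - num_emitted)


def endpoint_predicted(posteriors, num_emitted, threshold=0.5, blank=0):
    """Hypothetical rule: stop decoding this block when few tokens remain."""
    return expected_remaining_tokens(posteriors, num_emitted, blank) < threshold
```

With one-hot frames for the label sequence `1, 1, blank, 2`, the expected count is 2, so a decoder that has emitted one token would keep decoding, and one that has emitted two would wait for the next block.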
Related papers
- Streaming Parallel Transducer Beam Search With Fast-slow Cascaded Encoders (2022) 0.00
- Integration Of Frame- And Label-synchronous Beam Search For Streaming Encoder-decoder Speech Recognition (2023) 0.00
- Segment-level Vectorized Beam Search Based On Partially Autoregressive Inference (2023) 0.00
- Minimum Latency Training Strategies For Streaming Sequence-to-sequence ASR (2020) 10.07
- High Performance Sequence-to-sequence Model For Streaming Speech Recognition (2020) 3.58
- Cascaded Encoders For Unifying Streaming And Non-streaming ASR (2020) 12.47
- Joint Optimization Of Streaming And Non-streaming Automatic Speech Recognition With Multi-decoder And Knowledge Distillation (2024) 0.00
- Blockwise Streaming Transformer For Spoken Language Understanding And Simultaneous Speech Translation (2022) 4.52