Robust Beam Search For Encoder-decoder Attention Based Speech Recognition Without Length Bias
2020 · Wei Zhou, Ralf Schlüter, Hermann Ney
Abstract
As one popular modeling approach for end-to-end speech recognition, attention-based encoder-decoder models are known to suffer from length bias and the corresponding beam problem. Different approaches have been applied in simple beam search to ease the problem, most of which are heuristic-based and require considerable tuning. We show that heuristics are not a proper modeling refinement, and that they result in severe performance degradation with largely increased beam sizes. We propose a novel beam search derived from reinterpreting the sequence posterior with explicit length modeling. By applying the reinterpreted probability together with beam pruning, the obtained final probability leads to a robust model modification, which allows reliable comparison among output sequences of different lengths. Experimental verification on the LibriSpeech corpus shows that the proposed approach solves the length bias problem without heuristics or additional tuning effort. It provides robust decision making a
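The length bias mentioned in the abstract can be illustrated with a small sketch. It is not the beam search proposed in the paper: it only shows why summed log-probabilities tend to favor shorter hypotheses, and how length normalization (assumed here as one example of the heuristics the abstract refers to) changes the ranking. The per-token probabilities and variable names are hypothetical, chosen purely for illustration.

```python
import math


def sequence_log_prob(token_probs):
    """Sum of per-token log-probabilities, the score used by plain beam search."""
    return sum(math.log(p) for p in token_probs)


# Hypothetical per-token posteriors for two competing hypotheses.
short_hyp = [0.6, 0.6]                 # ends early, moderately confident
long_hyp = [0.8, 0.8, 0.8, 0.8, 0.8]   # longer, more confident at every step

short_score = sequence_log_prob(short_hyp)
long_score = sequence_log_prob(long_hyp)

# Every extra token adds a negative term, so the longer hypothesis is
# penalized even though each of its tokens is more probable.
print("raw scores:       ", short_score, long_score)

# Length normalization (a heuristic, assumed for illustration only):
# divide the score by the number of output tokens.
short_norm = short_score / len(short_hyp)
long_norm = long_score / len(long_hyp)
print("normalized scores:", short_norm, long_norm)
```

With these made-up numbers the short hypothesis wins on the raw score and the long one wins after normalization, which is the kind of length-dependent comparison the abstract argues should be handled by the model rather than by a tuned heuristic.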
Related papers
- Segment-level Vectorized Beam Search Based On Partially Autoregressive Inference (2023) · 0.00
- A Fully Differentiable Beam Search Decoder (2019) · 0.00
- Vectorization Of Hypotheses And Speech For Faster Beam Search In Encoder Decoder-based Speech Recognition (2018) · 0.00
- Contextualized Automatic Speech Recognition With Attention-based Bias Phrase Boosted Beam Search (2024) · 8.60
- Joint Beam Search Integrating CTC, Attention, And Transducer Decoders (2024) · 5.24
- Integration Of Frame- And Label-synchronous Beam Search For Streaming Encoder-decoder Speech Recognition (2023) · 0.00
- Chunked Attention-based Encoder-decoder Model For Streaming Speech Recognition (2023) · 7.81
- Streaming Parallel Transducer Beam Search With Fast-slow Cascaded Encoders (2022) · 0.00