Robust Acoustic And Semantic Contextual Biasing In Neural Transducers For Speech Recognition
2023 · Xuandi Fu, Kanthashree Mysore Sathyendra, Ankur Gandhe, et al.
Abstract
Attention-based contextual biasing approaches have shown significant improvements in the recognition of generic and/or personal rare words in End-to-End Automatic Speech Recognition (E2E ASR) systems like neural transducers. These approaches employ cross-attention to bias the model towards specific contextual entities injected into the model as bias phrases. Prior approaches typically relied on subword encoders for encoding the bias phrases. However, subword tokenizations are coarse and fail to capture granular pronunciation information, which is crucial for biasing based on acoustic similarity. In this work, we propose to use lightweight character representations to encode fine-grained pronunciation features to improve contextual biasing guided by acoustic similarity between the audio and the contextual entities (termed acoustic biasing). We further integrate pretrained neural language model (NLM) based encoders to encode the utterance's semantic context along with contextual entities to
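The cross-attention biasing idea described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the toy character encoder (a mean of fixed random per-character vectors), the example phrases, the 16-dimensional vectors, and the use of a misspelled string as a stand-in for an audio-derived query are all assumptions made purely for illustration. A real system would use learned encoders and queries produced by the transducer's audio encoder.

```python
import math
import random


def char_encode(phrase, dim=16):
    """Toy character-level phrase encoder (assumption, not the paper's model):
    the mean of fixed pseudo-random vectors, one per character."""
    vec = [0.0] * dim
    for ch in phrase:
        rng = random.Random(ord(ch))  # deterministic embedding per character
        emb = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec = [v + e for v, e in zip(vec, emb)]
    n = max(len(phrase), 1)
    return [v / n for v in vec]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_bias(query, phrase_vecs):
    """Scaled dot-product attention of one query over bias-phrase vectors.
    Returns the attention weights and the weighted sum of phrase vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale
              for key in phrase_vecs]
    weights = softmax(scores)
    bias = [sum(w * vec[i] for w, vec in zip(weights, phrase_vecs))
            for i in range(len(query))]
    return weights, bias


# Hypothetical bias phrases, e.g. contact names or device commands.
phrases = ["jon snow", "joan stone", "play jazz"]
keys = [char_encode(p) for p in phrases]

# Stand-in for an audio-derived query vector (assumption for this sketch).
query = char_encode("john snow")

weights, bias = cross_attention_bias(query, keys)
# Add the bias vector to the original state, as a simple residual combination.
biased_state = [q + b for q, b in zip(query, bias)]
```

Here `weights` gives one attention weight per bias phrase, and `biased_state` is the query nudged towards the phrases it attends to most. The residual addition is only one simple way of combining the bias vector with the model state.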
Related papers
- Improving Neural Biasing For Contextual Speech Recognition By Early Context Injection And Text Perturbation (2024) 8.09
- Contextualized Automatic Speech Recognition With Attention-based Bias Phrase Boosted Beam Search (2024) 8.60
- Adaptive Contextual Biasing For Transducer Based Streaming Speech Recognition (2023) 7.16
- Optimizing Contextual Speech Recognition Using Vector Quantization For Efficient Retrieval (2024) 2.26
- Contextualized End-to-end Automatic Speech Recognition With Intermediate Biasing Loss (2024) 5.84
- Towards Contextual Spelling Correction For Customization Of End-to-end Speech Recognition Systems (2022) 9.92
- Contextual Adapters For Personalized Speech Recognition In Neural Transducers (2022) 12.47
- Contextualized Streaming End-to-end Speech Recognition With Trie-based Deep Biasing And Shallow Fusion (2021) 13.44