Optimizing Contextual Speech Recognition Using Vector Quantization For Efficient Retrieval
2024 · Nikolaos Flemotomos, Roger Hsiao, Pawel Swietojanski, et al.
Abstract
Neural contextual biasing allows speech recognition models to leverage contextually relevant information, leading to improved transcription accuracy. However, the biasing mechanism is typically based on a cross-attention module between the audio and a catalogue of biasing entries, which means computational complexity can severely limit the practical size of the biasing catalogue and, consequently, the achievable accuracy improvements. This work proposes an approximation to cross-attention scoring based on vector quantization, which enables compute- and memory-efficient use of large biasing catalogues. We propose to use this technique jointly with a retrieval-based contextual biasing approach. First, we use an efficient quantized retrieval module to shortlist biasing entries by grounding them on audio. Then we use the retrieved entries for biasing. Since the proposed approach is agnostic to the biasing method, we investigate using full cross-attention, LLM prompting, and a combination of the two.
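To make the two-stage idea in the abstract concrete, here is a minimal sketch of quantized shortlisting. It is not the authors' implementation: the abstract does not specify the quantizer or the scoring details, so this sketch assumes a plain k-means codebook, dot-product scores, and a max over audio frames as the per-entry score. The names `train_codebook`, `nearest`, and `shortlist` are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def nearest(codebook, v):
    # Index of the codeword with the smallest squared Euclidean distance to v.
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], v)))

def train_codebook(vectors, num_codes, iters=10, seed=0):
    # Assumption: plain k-means (Lloyd's algorithm) stands in for the quantizer.
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, num_codes)]
    for _ in range(iters):
        buckets = [[] for _ in range(num_codes)]
        for v in vectors:
            buckets[nearest(codebook, v)].append(v)
        for k, members in enumerate(buckets):
            if members:  # an empty cluster keeps its previous codeword
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook

def shortlist(audio_queries, codes, codebook, top_n):
    # Score every codeword once per audio frame: T*K dot products
    # instead of T*N for a catalogue of N entries (K << N).
    code_scores = [[dot(q, c) for c in codebook] for q in audio_queries]
    # Assumption: an entry's approximate score is the best score its
    # codeword reaches on any frame, obtained by table lookup.
    entry_scores = [max(row[code] for row in code_scores) for code in codes]
    ranked = sorted(range(len(codes)), key=lambda i: entry_scores[i], reverse=True)
    return ranked[:top_n]

# Toy data: 200 catalogue embeddings, 5 audio frames, 16 codewords.
rng = random.Random(1)
catalogue = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
codebook = train_codebook(catalogue, num_codes=16)
codes = [nearest(codebook, v) for v in catalogue]  # one small integer per entry
frames = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(5)]
print(shortlist(frames, codes, codebook, top_n=10))
```

In this sketch only the codebook and one integer code per entry are needed at retrieval time, which is where the memory saving would come from. Entries that share a codeword receive identical approximate scores, so a second stage (exact cross-attention or LLM prompting over the shortlisted entries only, as the abstract describes) is what separates them.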
Related papers
- Robust Acoustic And Semantic Contextual Biasing In Neural Transducers For Speech Recognition (2023) 8.60
- Improving Neural Biasing For Contextual Speech Recognition By Early Context Injection And Text Perturbation (2024) 8.09
- Adaptive Contextual Biasing For Transducer Based Streaming Speech Recognition (2023) 7.16
- Contextualized End-to-end Automatic Speech Recognition With Intermediate Biasing Loss (2024) 5.84
- XCB: An Effective Contextual Biasing Approach To Bias Cross-lingual Phrases In Speech Recognition (2024) 0.00
- Locality Enhanced Dynamic Biasing And Sampling Strategies For Contextual ASR (2024) 0.00
- Contextualized Streaming End-to-end Speech Recognition With Trie-based Deep Biasing And Shallow Fusion (2021) 13.44
- Towards Contextual Spelling Correction For Customization Of End-to-end Speech Recognition Systems (2022) 9.92