Self-consistent Context Aware Conformer Transducer For Speech Recognition
2024 · Konstantin Kolokolov, Pavel Pekichev, Karthik Raghunathan
Abstract
We introduce a novel neural network module that adeptly handles recursive data flow in neural network architectures. At its core, this module employs a self-consistent approach where a set of recursive equations is solved iteratively, halting when the difference between two consecutive iterations falls below a defined threshold. Leveraging this mechanism, we construct a new neural network architecture, an extension of the conformer transducer, which enriches automatic speech recognition systems with a stream of contextual information. Our method notably improves the accuracy of recognizing rare words without adversely affecting the word error rate for common vocabulary. We investigate the improvement in accuracy for these uncommon words using our novel model, both independently and in conjunction with shallow fusion with a context language model. Our findings reveal that the combination of both approaches can improve the accuracy of detecting rare words by as much as 4.5 times. Our pro
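The halting rule described in the abstract (iterate until two consecutive iterations differ by less than a threshold) can be illustrated with a plain fixed-point loop. This is only a minimal sketch, not the authors' module: the names `solve_self_consistent` and `toy_update`, the max-absolute-difference criterion, the default tolerance, and the iteration cap are all assumptions made for illustration, and the toy update is a small contractive map standing in for the actual recursive equations.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def solve_self_consistent(
    update: Callable[[Vector], Vector],
    initial: Vector,
    tol: float = 1e-6,
    max_iters: int = 100,
) -> Tuple[Vector, int, bool]:
    """Iterate state <- update(state) until consecutive iterates differ by less than tol.

    Returns (final_state, iterations_used, converged_flag).
    """
    state = list(initial)
    for step in range(1, max_iters + 1):
        new_state = update(state)
        # Largest absolute change between two consecutive iterations.
        diff = max(abs(a - b) for a, b in zip(new_state, state))
        state = new_state
        if diff < tol:
            return state, step, True
    # The iteration cap is an assumed safeguard against non-converging updates.
    return state, max_iters, False


def toy_update(h: Vector) -> Vector:
    """Hypothetical stand-in for the recursive equations: two mutually dependent values.

    Each component depends on the other, so neither can be computed in one pass.
    The 0.5 coupling keeps the map contractive, which guarantees convergence.
    """
    x = [0.5, -1.0]  # fixed "input" terms, chosen arbitrarily
    return [
        math.tanh(0.5 * h[1] + x[0]),
        math.tanh(0.5 * h[0] + x[1]),
    ]


solution, steps, converged = solve_self_consistent(toy_update, [0.0, 0.0])
print(f"converged={converged} after {steps} iterations: {solution}")
```

In a real network the update would be a learned layer and the stopping test would likely run on tensors, but the control flow would follow the same pattern: apply the update, measure the change, stop once it is below the threshold.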
Related papers
- Towards Effective And Compact Contextual Representation For Conformer Transducer Speech Recognition Systems (2023) 7.16
- ContextNet: Improving Convolutional Neural Networks For Automatic Speech Recognition With Global Context (2020) 17.24
- Generalizing RNN-Transducer To Out-domain Audio Via Sparse Self-attention Layers (2021) 6.34
- Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers (2021) 10.07
- Constrained Convolutional-recurrent Networks To Improve Speech Quality With Low Impact On Recognition Accuracy (2018) 5.24
- Contextual Adapters For Personalized Speech Recognition In Neural Transducers (2022) 12.47
- Fast Contextual Adaptation With Neural Associative Memory For On-device Personalized Speech Recognition (2021) 9.76
- Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition (2023) 14.47