BayesSpeech: A Bayesian Transformer Network for Automatic Speech Recognition
2023 · Will Rieger
Abstract
Recent End-to-End Deep Learning models have been shown to match or outperform state-of-the-art Recurrent Neural Networks (RNNs) on Automatic Speech Recognition tasks. These models tend to be lighter weight and require less training time than traditional RNN-based approaches. However, they take a frequentist approach to weight training. In theory, network weights are drawn from a latent, intractable probability distribution. We introduce BayesSpeech for end-to-end Automatic Speech Recognition. BayesSpeech is a recurrence-free Bayesian Transformer Network in which these intractable posteriors are learned through variational inference and the local reparameterization trick. We show how the introduction of variance in the weights leads to faster training and near state-of-the-art performance on LibriSpeech-960.
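The local reparameterization trick named in the abstract can be illustrated with a minimal sketch. This is not the BayesSpeech implementation: it assumes a single linear layer with a fully factorized Gaussian posterior over its weights and a standard-normal prior, and the function names `local_reparam_linear` and `kl_to_standard_normal` are hypothetical. Instead of sampling every weight, the layer samples its pre-activations directly from the Gaussian that the weight posterior implies.

```python
import math
import random


def local_reparam_linear(x, mu, log_sigma2, rng):
    """Sample the pre-activations of a Bayesian linear layer.

    x          : list of input activations, length n_in
    mu         : n_in x n_out nested list of posterior weight means
    log_sigma2 : n_in x n_out nested list of posterior log-variances
    rng        : random.Random instance supplying the noise
    """
    n_in, n_out = len(mu), len(mu[0])
    out = []
    for j in range(n_out):
        # Mean and variance of output j implied by the factorized
        # Gaussian posterior over the weights (an assumption of this sketch).
        mean_j = sum(x[i] * mu[i][j] for i in range(n_in))
        var_j = sum(x[i] ** 2 * math.exp(log_sigma2[i][j]) for i in range(n_in))
        # Sample the pre-activation directly rather than each weight.
        out.append(mean_j + math.sqrt(var_j) * rng.gauss(0.0, 1.0))
    return out


def kl_to_standard_normal(mu, log_sigma2):
    """KL(q || N(0, 1)) summed over all weights (assumed prior)."""
    total = 0.0
    for row_mu, row_ls in zip(mu, log_sigma2):
        for m, ls in zip(row_mu, row_ls):
            total += 0.5 * (math.exp(ls) + m * m - 1.0 - ls)
    return total
```

In a variational training loop, the KL term would be added to the task loss to form the negative evidence lower bound. Sampling one noise value per output unit, rather than one per weight, is what makes the trick cheaper and lower in gradient variance than sampling the weights themselves.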
Related papers
- Multitask Learning And Joint Optimization For Transformer-RNN-Transducer Speech Recognition (2020)
- Bayesian Learning Of LF-MMI Trained Time Delay Neural Networks For Speech Recognition (2020)
- Transformer-Transducer: End-to-End Speech Recognition With Self-Attention (2019)
- Lightweight And Efficient End-to-End Speech Recognition Using Low-Rank Transformer (2019)
- A Comparative Study On Transformer Vs RNN In Speech Applications (2019)
- Fast Offline Transformer-Based End-to-End Automatic Speech Recognition For Real-World Applications (2021)
- Transformer Transducer: A Streamable Speech Recognition Model With Transformer Encoders And RNN-T Loss (2020)
- Minimum Bayes Risk Training Of RNN-Transducer For End-to-End Speech Recognition (2019)