An Online Sequence-to-sequence Model For Noisy Speech Recognition
2017 · Chung-Cheng Chiu, Dieterich Lawson, Yuping Luo, et al.
Abstract
Generative models have long been the dominant approach for speech recognition. The success of these models, however, relies on sophisticated recipes and complicated machinery that are not easily accessible to non-practitioners. Recent innovations in deep learning have given rise to an alternative: discriminative models, called sequence-to-sequence models, that can almost match the accuracy of state-of-the-art generative models. Although these models are easy to train, since they can be trained end-to-end in a single step, they have a practical limitation: they can only be used for offline recognition. This is because the models require the entire input sequence to be available at the beginning of inference, an assumption that does not hold for real-time speech recognition. To address this problem, online sequence-to-sequence models were recently introduced. These models are able to start producing outputs as data arrives, and the model feels confident enough to outpu
Related papers
- High Performance Sequence-to-sequence Model For Streaming Speech Recognition (2020) 3.58
- Robust Speech Recognition Using Generative Adversarial Networks (2017) 11.29
- Towards Better Decoding And Language Model Integration In Sequence To Sequence Models (2016) 15.67
- Learning Online Alignments With Continuous Rewards Policy Gradient (2016) 8.60
- State-of-the-art Speech Recognition With Sequence-to-sequence Models (2017) 21.01
- On Using 2D Sequence-to-sequence Models For Speech Recognition (2019) 0.00
- Low-latency Speech Enhancement Via Speech Token Generation (2023) 7.50
- Unsupervised Speech Enhancement With Deep Dynamical Generative Speech And Noise Models (2023) 0.00