Exploiting Semi-supervised Training Through A Dropout Regularization In End-to-end Speech Recognition
2019 · Subhadeep Dey, Petr Motlicek, Trung Bui, et al.
Abstract
In this paper, we explore various approaches to semi-supervised learning in an end-to-end automatic speech recognition (ASR) framework. The first step in our approach involves training a seed model on the limited amount of labelled data. Additional unlabelled speech data is then employed through a data selection mechanism to obtain the best hypothesized output, which is further used to retrain the seed model. However, the uncertainties of the model may not be well captured with a single hypothesis. As opposed to this technique, we apply a dropout mechanism to capture the uncertainty by obtaining multiple hypothesized text transcripts of a speech recording. We assume that the diversity of automatically generated transcripts for an utterance will implicitly increase the reliability of the model. Finally, the data selection process is also applied to these hypothesized transcripts to reduce the uncertainty. Experiments on the freely available TEDLIUM corpus and Adobe's proprietary internal dataset show that
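The dropout-based selection idea described in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: the `decode` callable (a stand-in for decoding with dropout kept active), the agreement measure (mean pairwise word-level edit distance), and the `threshold` value are all assumptions introduced here for illustration, since the abstract does not specify the selection criterion.

```python
import random
from itertools import combinations


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def mean_pairwise_disagreement(hyps):
    """Average length-normalised edit distance over all hypothesis pairs."""
    pairs = list(combinations(hyps, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        ta, tb = a.split(), b.split()
        total += edit_distance(ta, tb) / max(len(ta), len(tb), 1)
    return total / len(pairs)


def select_pseudo_labels(decode, utterances, n_samples=5, threshold=0.2, seed=0):
    """Keep utterances whose dropout-sampled transcripts mostly agree.

    `decode(utt, rng)` is a hypothetical callable: it should return one
    transcript obtained with a freshly sampled dropout mask.
    """
    rng = random.Random(seed)
    selected = {}
    for utt in utterances:
        # Several stochastic decoding passes give several hypotheses.
        hyps = [decode(utt, rng) for _ in range(n_samples)]
        # Assumed criterion: low disagreement means low model uncertainty.
        if mean_pairwise_disagreement(hyps) <= threshold:
            selected[utt] = hyps
    return selected


def toy_decode(utt, rng):
    """Toy decoder: stable for 'clean' input, word order shuffled otherwise."""
    words = "the quick brown fox".split()
    if utt != "clean":
        rng.shuffle(words)
    return " ".join(words)


if __name__ == "__main__":
    print(sorted(select_pseudo_labels(toy_decode, ["clean", "noisy"])))
```

In a real system, the selected utterances and their hypothesized transcripts would be added to the labelled data before retraining the seed model. Keeping every surviving hypothesis rather than only one is a design choice of this sketch, motivated by the abstract's remark about transcript diversity.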
Related papers
- Improved Regularization Techniques For End-to-end Speech Recognition (2017) 0.00
- Dropout Regularization For Self-supervised Learning Of Transformer Encoder Speech Representation (2021) 4.52
- Unsupervised Domain Adaptation For Speech Recognition Via Uncertainty Driven Self-training (2020) 12.25
- Semi-supervised Sequence-to-sequence ASR Using Unpaired Speech And Text (2019) 0.00
- End-to-end ASR: From Supervised To Semi-supervised Learning With Modern Architectures (2019) 0.00
- End-to-end Rich Transcription-style Automatic Speech Recognition With Semi-supervised Learning (2021) 4.52
- Unsupervised Fine-tuning Data Selection For ASR Using Self-supervised Speech Models (2022) 5.84
- Improving Noisy Student Training For Low-resource Languages In End-to-end ASR Using CycleGAN And Inter-domain Losses (2024) 0.00