Transfer Learning For Speech Recognition On A Budget
2017 · Julius Kunze, Louis Kirsch, Ilia Kurenkov, et al.
Abstract
End-to-end training of automated speech recognition (ASR) systems requires massive data and compute resources. We explore transfer learning based on model adaptation as an approach for training ASR models under constrained GPU memory, throughput and training data. We conduct several systematic experiments adapting a Wav2Letter convolutional neural network originally trained for English ASR to the German language. We show that this technique allows faster training on consumer-grade resources while requiring less training data in order to achieve the same accuracy, thereby lowering the cost of training ASR models in other languages. Model introspection revealed that small adaptations to the network's weights were sufficient for good performance, especially for inner layers.
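The adaptation idea in the abstract can be illustrated with a toy sketch: copy the weights of a source-language model, swap the output layer for one sized to the target alphabet, keep some early layers fixed, and afterwards measure how far each layer moved. Everything here is an assumption made for illustration: the label sets, the helper names (`make_model`, `adapt_model`, `relative_change`), the choice to re-initialize the output layer, and the choice to freeze layers are not taken from the abstract, and plain weight matrices stand in for the convolutional Wav2Letter layers.

```python
import copy
import math
import random

# Hypothetical output alphabets; the label sets of the real systems may differ.
ENGLISH_LABELS = list("abcdefghijklmnopqrstuvwxyz' ")
GERMAN_LABELS = ENGLISH_LABELS + list("äöüß")


def make_model(num_layers, width, num_labels, seed=0):
    """Build a toy stack of weight matrices standing in for a conv net."""
    rng = random.Random(seed)
    layers = [
        [[rng.gauss(0.0, 0.1) for _ in range(width)] for _ in range(width)]
        for _ in range(num_layers - 1)
    ]
    # The final layer maps hidden features to one score per output label.
    layers.append(
        [[rng.gauss(0.0, 0.1) for _ in range(width)] for _ in range(num_labels)]
    )
    return layers


def adapt_model(source, num_target_labels, num_frozen, seed=1):
    """Copy source weights, re-initialize the output layer, mark frozen layers."""
    rng = random.Random(seed)
    target = copy.deepcopy(source)
    width = len(source[-1][0])
    target[-1] = [
        [rng.gauss(0.0, 0.1) for _ in range(width)]
        for _ in range(num_target_labels)
    ]
    # trainable[i] is False for the first num_frozen layers.
    trainable = [i >= num_frozen for i in range(len(target))]
    return target, trainable


def relative_change(before, after):
    """Relative L2 distance between two same-shaped weight matrices."""
    diff = sum(
        (a - b) ** 2
        for row_b, row_a in zip(before, after)
        for b, a in zip(row_b, row_a)
    )
    norm = sum(b ** 2 for row_b in before for b in row_b)
    return math.sqrt(diff / norm)


english = make_model(num_layers=4, width=8, num_labels=len(ENGLISH_LABELS))
german, trainable = adapt_model(english, len(GERMAN_LABELS), num_frozen=2)

# Before any fine-tuning, copied layers are identical to the source.
unchanged = relative_change(english[0], german[0])
```

In a real setup, a training loop would update only the layers flagged in `trainable`, and `relative_change` would be computed per layer after fine-tuning to see which layers needed the largest adjustments.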
Related papers
- Neural Transducer Training: Reduced Memory Consumption With Sample-wise Computation (2022)
- Efficient Adapter Transfer Of Self-supervised Speech Models For Automatic Speech Recognition (2022)
- Knowledge Transfer From Large-scale Pretrained Language Models To End-to-end Speech Recognizers (2022)
- Bootstrap An End-to-end ASR System By Multilingual Training, Transfer Learning, Text-to-text Mapping And Synthetic Audio (2020)
- Gated Low-rank Adaptation For Personalized Code-switching Automatic Speech Recognition On The Low-spec Devices (2024)
- LiteVSR: Efficient Visual Speech Recognition By Learning From Speech Representations Of Unlabeled Data (2023)
- Incremental Layer-wise Self-supervised Learning For Efficient Speech Domain Adaptation On Device (2021)
- Generative Adversarial Training Data Adaptation For Very Low-resource Automatic Speech Recognition (2020)