A Simple Baseline For Domain Adaptation In End To End ASR Systems Using Synthetic Data
2022 · Raviraj Joshi, Anupam Singh
Abstract
Automatic Speech Recognition (ASR) has been dominated by deep learning-based end-to-end models. These approaches require large amounts of labeled data in the form of audio-text pairs, and they are more susceptible to domain shift than traditional models. It is common practice to train a generic ASR model and then adapt it to a target domain using a comparatively small data set. We consider a more extreme case of domain adaptation in which only a text corpus is available for the target domain. In this work, we propose a simple baseline technique for domain adaptation in end-to-end speech recognition models. We convert the text-only corpus to audio using a single-speaker Text-to-Speech (TTS) engine. The resulting parallel data in the target domain is then used to fine-tune the final dense layer of generic ASR models. We show that single-speaker synthetic TTS data, coupled with fine-tuning of only the final dense layer, provides reasonable improvements in word error rate. We use text dat
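The abstract reports its gains as word error rate (WER). As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not say which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, with no
    case folding or punctuation handling (an assumption, not the paper's setup).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis = i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of four reference words.
    print(word_error_rate("turn on the light", "turn on light"))
```

Comparing a generic model and an adapted model with this metric would mean scoring both on the same target-domain test transcripts and comparing the two values; lower is better.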
Related papers
- A Domain Adaptation Framework For Speech Recognition Systems With Only Synthetic Data (2025)
- Exploring Machine Speech Chain For Domain Adaptation And Few-shot Speaker Adaptation (2021)
- Generating Synthetic Audio Data For Attention-based Speech Recognition Systems (2019)
- Enhancing Synthetic Training Data For Speech Commands: From ASR-based Filtering To Domain Adaptation In SSL Latent Space (2024)
- Rapid Speaker Adaptation In Low Resource Text To Speech Systems Using Synthetic Data And Transfer Learning (2023)
- Text-only Domain Adaptation For End-to-end Speech Recognition Through Down-sampling Acoustic Representation (2023)
- Text-only Domain Adaptation Using Unified Speech-text Representation In Transducer (2023)
- On The Effect Of Purely Synthetic Training Data For Different Automatic Speech Recognition Architectures (2024)