Synthetic Cross-accent Data Augmentation For Automatic Speech Recognition
2023 · Philipp Klumpp, Pooja Chitkara, Leda Sarı, et al.
Abstract
Awareness of bias in ASR datasets and models has increased notably in recent years. Even for English, despite a vast amount of available training data, systems perform worse for non-native speakers. In this work, we improve an accent-conversion model (ACM) that transforms native US-English speech into accented pronunciation. We include phonetic knowledge in the ACM training to provide accurate feedback about how well certain pronunciation patterns were recovered in the synthesized waveform. Furthermore, we investigate the feasibility of learned accent representations instead of static embeddings. The generated data was then used to train two state-of-the-art ASR systems. We evaluated our approach on native and non-native English datasets and found that synthetically accented data helped the ASR systems better recognize speech from seen accents. This improvement did not carry over to unseen accents, and it was not observed for a model that had been pre-trained exclusively on native speech.
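The comparison across native, seen-accent, and unseen-accent test sets can be sketched as a per-group error computation. The abstract does not name its metric, so word error rate (WER) is an assumption here, and the group labels, the `wer_by_group` helper, and the sample transcripts are hypothetical illustrations, not the authors' evaluation code or data.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Pool errors per group: samples are (group, reference, hypothesis)."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, ref, hyp in samples:
        ref_tokens = ref.lower().split()
        hyp_tokens = hyp.lower().split()
        errors[group] += edit_distance(ref_tokens, hyp_tokens)
        words[group] += len(ref_tokens)
    # Skip groups without reference words to avoid division by zero.
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Made-up transcripts, only to show the call shape (assumption).
samples = [
    ("native", "turn the lights on", "turn the lights on"),
    ("seen_accent", "turn the lights on", "turn the light on"),
    ("unseen_accent", "turn the lights on", "turn lights in"),
]
scores = wer_by_group(samples)
```

Pooling errors and reference words per group, instead of averaging per-utterance rates, keeps short utterances from dominating a group's score. Running the same function on outputs from a baseline model and from a model trained with synthetic accented data would give the kind of per-accent comparison the abstract describes.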
Related papers
- Accent Conversion Using Discrete Units With Parallel Data Synthesized From Controllable Accented TTS (2024)
- Improving Accented Speech Recognition Using Data Augmentation Based On Unsupervised Text-to-speech Synthesis (2024)
- ASR Data Augmentation In Low-resource Settings Using Cross-lingual Multi-speaker TTS And Cross-lingual Voice Conversion (2022)
- Improving Accent Conversion With Reference Encoder And End-to-end Text-to-speech (2020)
- Exploring Data Augmentation In Bias Mitigation Against Non-native-accented Speech (2023)
- Zero Shot Text To Speech Augmentation For Automatic Speech Recognition On Low-resource Accented Speech Corpora (2024)
- Accent-robust Automatic Speech Recognition Using Supervised And Unsupervised Wav2vec Embeddings (2021)
- Accent Conversion In Text-to-speech Using Multi-level VAE And Adversarial Training (2024)