Learning Invariant Representation And Risk Minimized For Unsupervised Accent Domain Adaptation
2022 · Chendong Zhao, Jianzong Wang, Xiaoyang Qu, et al.
Abstract
Unsupervised representation learning for speech audio has attained impressive performance on speech recognition tasks, particularly when annotated speech is limited. However, the unsupervised paradigm needs to be carefully designed, and little is known about what properties these representations acquire. There is no guarantee that the model learns representations that capture the information valuable for recognition. Moreover, how well the learned representations adapt to other domains still needs to be assessed. In this work, we explore learning domain-invariant representations via a direct mapping of speech representations to their corresponding high-level linguistic information. Results show that the learned latents not only capture the articulatory features of each phoneme but also enhance adaptation ability, outperforming the baseline by a large margin on accented benchmarks.
Related papers
- Unsupervised Domain Adaptation For Robust Speech Recognition Via Variational Autoencoder-based Data Augmentation (2017)
- Extracting Domain Invariant Features By Unsupervised Learning For Robust Automatic Speech Recognition (2018)
- Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-supervised Speech Units (2023)
- Unsupervised Domain Adaptation By Adversarial Learning For Robust Speech Recognition (2018)
- Unsupervised Adaptation With Domain Separation Networks For Robust Speech Recognition (2017)
- An Unsupervised Autoregressive Model For Speech Representation Learning (2019)
- Automatic Data Augmentation For Domain Adapted Fine-tuning Of Self-supervised Speech Representations (2023)
- Adversarial Learning Of Raw Speech Features For Domain Invariant Speech Recognition (2018)