Boosting Cross-domain Speech Recognition With Self-supervision
2022 · Han Zhu, Gaofeng Cheng, Jindong Wang, et al.
Abstract
The cross-domain performance of automatic speech recognition (ASR) could be severely hampered due to the mismatch between training and testing distributions. Since the target domain usually lacks labeled data, and domain shifts exist at acoustic and linguistic levels, it is challenging to perform unsupervised domain adaptation (UDA) for ASR. Previous work has shown that self-supervised learning (SSL) or pseudo-labeling (PL) is effective in UDA by exploiting the self-supervisions of unlabeled data. However, these self-supervisions also face performance degradation in mismatched domain distributions, which previous work fails to address. This work presents a systematic UDA framework to fully utilize the unlabeled data with self-supervision in the pre-training and fine-tuning paradigm. On the one hand, we apply continued pre-training and data replay techniques to mitigate the domain mismatch of the SSL pre-trained model. On the other hand, we propose a domain-adaptive fine-tuning approach
Related papers
- Automatic Data Augmentation For Domain Adapted Fine-tuning Of Self-supervised Speech Representations (2023)
- Self-supervised Learning Based Domain Adaptation For Robust Speaker Verification (2021)
- Deploying Self-supervised Learning In The Wild For Hybrid Automatic Speech Recognition (2022)
- Unsupervised Domain Adaptation For Speech Recognition Via Uncertainty Driven Self-training (2020)
- Analyzing The Factors Affecting Usefulness Of Self-supervised Pre-trained Representations For Speech Recognition (2022)
- DRAFT: A Novel Framework To Reduce Domain Shifting In Self-supervised Learning And Its Application To Children's ASR (2022)
- Ac-mix: Self-supervised Adaptation For Low-resource Automatic Speech Recognition Using Agnostic Contrastive Mixup (2024)
- PADA: Pruning Assisted Domain Adaptation For Self-supervised Speech Representations (2022)