Bi-APC: Bidirectional Autoregressive Predictive Coding For Unsupervised Pre-training And Its Application To Children's ASR
2021 · Ruchao Fan, Amber Afshan, Abeer Alwan
Abstract
We present a bidirectional unsupervised model pre-training (UPT) method and apply it to children's automatic speech recognition (ASR). An obstacle to improving child ASR is the scarcity of child speech databases. A common approach to alleviate this problem is model pre-training using data from adult speech. Pre-training can be done using supervised (SPT) or unsupervised methods, depending on the availability of annotations. Typically, SPT performs better. In this paper, we focus on UPT to address situations in which the pre-training data are unlabeled. Autoregressive predictive coding (APC), a UPT method, predicts frames from only one direction, limiting its use to unidirectional pre-training. Conventional bidirectional UPT methods, however, predict only a small portion of frames. To extend the benefits of APC to bidirectional pre-training, Bi-APC is proposed. We then use adaptation techniques to transfer knowledge learned from adult speech (using the LibriSpeech corpus) to child speech
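The one-direction limitation described above can be illustrated with a minimal sketch of how APC-style training pairs might be built. It assumes the standard APC setup of predicting the frame `shift` steps ahead with an L1 loss; the "backward" branch is only an illustrative mirror of that idea, not the Bi-APC method itself, whose details the abstract does not give. The function names `apc_pairs` and `l1_loss` are hypothetical.

```python
import numpy as np


def apc_pairs(frames, shift, direction="forward"):
    """Return (inputs, targets) for frame prediction `shift` steps away.

    frames: array of shape (num_frames, feature_dim).
    "forward" pairs frame t with frame t + shift (standard APC).
    "backward" pairs frame t with frame t - shift (assumed mirror case).
    """
    if not 1 <= shift < len(frames):
        raise ValueError("shift must be in [1, num_frames - 1]")
    if direction == "forward":
        return frames[:-shift], frames[shift:]
    if direction == "backward":
        return frames[shift:], frames[:-shift]
    raise ValueError("direction must be 'forward' or 'backward'")


def l1_loss(pred, target):
    """Mean absolute error between predicted and target frames."""
    return float(np.mean(np.abs(pred - target)))


# Toy "utterance": 6 frames with 2 features each.
frames = np.arange(12, dtype=float).reshape(6, 2)

fwd_in, fwd_tgt = apc_pairs(frames, shift=2, direction="forward")
bwd_in, bwd_tgt = apc_pairs(frames, shift=2, direction="backward")

# A trivial stand-in "model" that copies its input frame; a real model
# would be a recurrent or Transformer encoder trained on this loss.
copy_loss = l1_loss(fwd_in, fwd_tgt)
```

In this sketch every frame except the first or last `shift` frames serves as a prediction target, which is the property the abstract contrasts with bidirectional methods that predict only a small portion of frames.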
Related papers
- Generative Pre-training For Speech With Autoregressive Predictive Coding (2019) 14.73
- Low Resource German ASR With Untranscribed Data Spoken By Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System (2021) 3.58
- Guided Contrastive Self-supervised Pre-training For Automatic Speech Recognition (2022) 0.00
- Unsupervised Pre-training Of Bidirectional Speech Encoders Via Masked Reconstruction (2020) 12.33
- LPC Augment: An LPC-based ASR Data Augmentation Algorithm For Low And Zero-resource Children's Dialects (2022) 7.81
- The Effectiveness Of Unsupervised Subword Modeling With Autoregressive And Cross-lingual Phone-aware Networks (2020) 2.26
- REBORN: Reinforcement-learned Boundary Segmentation With Iterative Training For Unsupervised ASR (2024) 2.26
- Improved Speech Representations With Multi-target Autoregressive Predictive Coding (2020) 10.97