Benchmarking Children's ASR With Supervised And Self-supervised Speech Foundation Models
2024 · Ruchao Fan, Natarajan Balaji Shankar, Abeer Alwan
Abstract
Speech foundation models (SFMs) have achieved state-of-the-art results on various speech tasks in both supervised (e.g., Whisper) and self-supervised (e.g., WavLM) systems. However, the performance of SFMs for child ASR has not been systematically studied. In addition, there is no benchmark for child ASR with standard evaluations, which makes it difficult to compare novel ideas. In this paper, we initiate and present a comprehensive benchmark on several child speech databases based on various SFMs (Whisper, Wav2vec2.0, HuBERT, and WavLM). Moreover, we investigate finetuning strategies by comparing various data augmentation and parameter-efficient finetuning (PEFT) methods. We observe that these methods behave differently as the model size increases. For example, PEFT matches the performance of full finetuning for large models but performs worse for small models. To stabilize finetuning with augmented data, we propose a perturbation invariant finetuning (PIF) loss as a regularizer.
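The abstract names the PIF loss but does not define it. As a rough illustration of the general idea only (a regularizer that discourages the model from changing its output when the input is perturbed), the sketch below adds a consistency term to an ordinary task loss. Everything in it is an assumption rather than the authors' formulation: the function names, the frame-level cross-entropy standing in for an ASR loss, the symmetric KL divergence as the consistency measure, and the `weight` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability of the target class for one frame."""
    return -math.log(softmax(logits)[target])


def symmetric_kl(p, q, eps=1e-12):
    """Average of KL(p||q) and KL(q||p); zero when p equals q."""
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return 0.5 * (kl_pq + kl_qp)


def pif_style_loss(clean_logits, perturbed_logits, targets, weight=1.0):
    """Hypothetical perturbation-invariance objective (not the paper's definition).

    clean_logits / perturbed_logits: per-frame logits for the original and
    the augmented version of the same utterance. targets: per-frame labels.
    """
    n = len(targets)
    # Task term: average loss over both views of the utterance.
    task = sum(
        cross_entropy(c, t) + cross_entropy(p, t)
        for c, p, t in zip(clean_logits, perturbed_logits, targets)
    ) / (2 * n)
    # Consistency term: penalize disagreement between the two views.
    consistency = sum(
        symmetric_kl(softmax(c), softmax(p))
        for c, p in zip(clean_logits, perturbed_logits)
    ) / n
    return task + weight * consistency
```

With identical clean and perturbed outputs the consistency term vanishes, so the objective reduces to the plain task loss; the penalty grows as the two views diverge. A real system would use a sequence-level ASR loss in place of the frame-level cross-entropy used here.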
Related papers
- A Comparative Analysis Between Conformer-transducer, Whisper, And Wav2vec2 For Improving The Child Speech Recognition (2023)
- Improving Child Speech Recognition With Augmented Child-like Speech (2024)
- Investigating Zero-shot Generalizability On Mandarin-english Code-switched ASR And Speech-to-text Translation Of Recent Foundation Models With Self-supervision And Weak Supervision (2023)
- Fine-tuning Strategies For Faster Inference Using Speech Self-supervised Models: A Comparative Study (2023)
- Unsupervised Fine-tuning Data Selection For ASR Using Self-supervised Speech Models (2022)
- Resource-efficient Adaptation Of Speech Foundation Models For Multi-speaker ASR (2024)
- Examining Test-time Adaptation For Personalized Child Speech Recognition (2024)
- Speech Self-supervised Representations Benchmarking: A Case For Larger Probing Heads (2023)