CHAPTER: Exploiting Convolutional Neural Network Adapters For Self-supervised Speech Models
2022 · Zih-Ching Chen, Yu-Shun Sung, Hung-Yi Lee
Abstract
Self-supervised learning (SSL) is a powerful technique for learning representations from unlabeled data. Transformer-based models such as HuBERT, which consist of a feature extractor and transformer layers, are leading the field in the speech domain. SSL models are typically fine-tuned on a wide range of downstream tasks, which involves re-training the majority of the model for each task. Previous studies have applied adapters, small lightweight modules commonly used in Natural Language Processing (NLP), to adapt pre-trained models to new tasks. However, such efficient tuning techniques only provide adaptation at the transformer layers and fail to perform adaptation at the feature extractor. In this paper, we propose CHAPTER, an efficient tuning method specifically designed for SSL speech models, which applies CNN adapters at the feature extractor. Using this method, we fine-tune fewer than 5% of parameters per task compared to full fine-tuning and achieve better and m
Related papers
- Efficient Adapter Transfer Of Self-supervised Speech Models For Automatic Speech Recognition (2022)
- Exploring Efficient-tuning Methods In Self-supervised Speech Models (2022)
- Front-end Adapter: Adapting Front-end Input Of Speech Based Self-supervised Learning For Speech Recognition (2023)
- Efficient Adapter Tuning Of Pre-trained Speech Models For Automatic Speaker Verification (2024)
- How To Learn A New Language? An Efficient Solution For Self-supervised Learning Models Unseen Languages Adaption In Low-resource Scenario (2024)
- An Adapter Based Pre-training For Efficient And Scalable Self-supervised Speech Representation Learning (2021)
- An Adapter Based Multi-label Pre-training For Speech Separation And Enhancement (2022)
- Recycle-and-distill: Universal Compression Strategy For Transformer-based Speech SSL Models With Attention Map Reusing And Masking Distillation (2023)