Exploring Efficient-tuning Methods In Self-supervised Speech Models
2022 · Zih-Ching Chen, Chin-Lun Fu, Chih-Ying Liu, et al.
Abstract
In this study, we explore efficient tuning methods for self-supervised speech models. Recent studies show that self-supervised learning (SSL) can learn powerful representations for different speech tasks. However, fine-tuning a pre-trained model for each downstream task is parameter-inefficient, since SSL models are notoriously large, with millions of parameters. Adapters are lightweight modules commonly used in NLP to solve this problem: for a downstream task, the parameters of the SSL model are frozen and only the adapters are trained. Given the lack of studies broadly exploring the effectiveness of adapters for self-supervised speech tasks, we intend to fill this gap by adding various adapter modules to pre-trained speech SSL models. We show that performance parity can be achieved with over 90% parameter reduction, and we discuss the pros and cons of efficient tuning techniques. This is the first comprehensive investigation of various adapter types across speech tasks.
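To make the parameter-efficiency argument concrete, the following is a minimal counting sketch, not the paper's setup. It assumes a 12-layer Transformer encoder with hidden size 768 and one bottleneck adapter (down-projection, up-projection) of width 32 per layer; these sizes and the function names are illustrative assumptions. It ignores the convolutional feature encoder and any downstream head.

```python
# Illustrative parameter-count sketch. All sizes and function names are
# assumptions for illustration, not figures or code from the paper.


def transformer_layer_params(d_model: int, ffn_mult: int = 4) -> int:
    """Weights and biases of one standard Transformer encoder layer."""
    # Query, key, value, and output projections, each d_model x d_model + bias.
    attention = 4 * (d_model * d_model + d_model)
    # Two-layer feed-forward block with hidden width ffn_mult * d_model.
    d_ffn = ffn_mult * d_model
    feed_forward = (d_model * d_ffn + d_ffn) + (d_ffn * d_model + d_model)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    return attention + feed_forward + layer_norms


def bottleneck_adapter_params(d_model: int, bottleneck: int) -> int:
    """Down-projection plus up-projection, each with a bias."""
    down = d_model * bottleneck + bottleneck
    up = bottleneck * d_model + d_model
    return down + up


def trainable_fraction(num_layers: int, d_model: int, bottleneck: int) -> float:
    """Adapter parameters relative to the frozen encoder parameters."""
    frozen = num_layers * transformer_layer_params(d_model)
    trainable = num_layers * bottleneck_adapter_params(d_model, bottleneck)
    return trainable / frozen


if __name__ == "__main__":
    fraction = trainable_fraction(num_layers=12, d_model=768, bottleneck=32)
    print(f"trainable / frozen = {fraction:.4f}")
    print(f"reduction vs. full fine-tuning = {1 - fraction:.2%}")
```

Under these assumed sizes the adapters account for well under 10% of the encoder's parameters, which is the kind of saving the abstract refers to. The exact ratio depends on the adapter type, its width, and where it is inserted.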
Related papers
- CHAPTER: Exploiting Convolutional Neural Network Adapters For Self-supervised Speech Models (2022)
- Efficient Adapter Transfer Of Self-supervised Speech Models For Automatic Speech Recognition (2022)
- Elp-adapters: Parameter Efficient Adapter Tuning For Various Speech Processing Tasks (2024)
- Efficient Adapter Tuning Of Pre-trained Speech Models For Automatic Speaker Verification (2024)
- Unipet-spk: A Unified Framework For Parameter-efficient Tuning Of Pre-trained Speech Models For Robust Speaker Verification (2025)
- How To Learn A New Language? An Efficient Solution For Self-supervised Learning Models Unseen Languages Adaption In Low-resource Scenario (2024)
- Fine-tuning Strategies For Faster Inference Using Speech Self-supervised Models: A Comparative Study (2023)
- Front-end Adapter: Adapting Front-end Input Of Speech Based Self-supervised Learning For Speech Recognition (2023)