UniPET-SPK: A Unified Framework For Parameter-efficient Tuning Of Pre-trained Speech Models For Robust Speaker Verification
2025 · Mufan Sang, John H. L. Hansen
Abstract
With their excellent generalization ability, self-supervised learning (SSL) speech models have shown impressive performance on various downstream tasks in the pre-training and fine-tuning paradigm. However, as pre-trained models grow in size, fine-tuning becomes practically infeasible because of the growing computation and storage requirements and the risk of overfitting. This study explores parameter-efficient tuning (PET) methods for adapting large-scale pre-trained SSL speech models to the speaker verification task. We propose three PET methods: (i) an adapter-tuning method, (ii) a prompt-tuning method, and (iii) a unified framework that effectively incorporates adapter-tuning and prompt-tuning with a dynamically learnable gating mechanism. First, we propose the Inner+Inter Adapter framework, which inserts two types of adapters into pre-trained models, allowing for adaptation of latent features within the intermediate Transformer layers and output embeddings from all Transformer layers, through a parallel
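To make the ideas named in the abstract concrete, the following is a minimal, generic sketch of a residual bottleneck adapter and of prompt vectors prepended to a frame sequence, each scaled by a sigmoid gate. It is not the authors' Inner+Inter Adapter or their gating design, which the abstract does not specify. The class `BottleneckAdapter`, the function `prepend_prompts`, the scalar `gate_logit`, and the zero-initialized up-projection are all assumptions chosen for illustration.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BottleneckAdapter:
    """Hypothetical adapter: x + sigmoid(gate) * W_up relu(W_down x).

    Only w_down, w_up and gate_logit would be trained; the pre-trained
    backbone producing x is assumed to stay frozen.
    """

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        self.w_down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                       for _ in range(bottleneck)]
        # Zero up-projection (an assumption): the adapter starts as identity.
        self.w_up = [[0.0] * bottleneck for _ in range(dim)]
        self.gate_logit = 0.0  # scalar gate parameter (an assumption)

    def __call__(self, x):
        hidden = [max(0.0, v) for v in matvec(self.w_down, x)]  # ReLU
        delta = matvec(self.w_up, hidden)
        g = sigmoid(self.gate_logit)
        return [xi + g * di for xi, di in zip(x, delta)]

    def num_params(self):
        """Count the adapter's own parameters, including the gate."""
        return (sum(len(r) for r in self.w_down)
                + sum(len(r) for r in self.w_up) + 1)


def prepend_prompts(frames, prompts, gate_logit):
    """Prepend gate-scaled prompt vectors to a sequence of frame vectors."""
    g = sigmoid(gate_logit)
    return [[g * p for p in prompt] for prompt in prompts] + frames
```

With a feature dimension of 768 and a bottleneck of 32, `BottleneckAdapter(768, 32).num_params()` counts 2 * 768 * 32 + 1 = 49,153 parameters, which illustrates why such modules are cheap to train and store compared with updating a whole backbone.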
Related papers
- Efficient Adapter Tuning Of Pre-trained Speech Models For Automatic Speaker Verification (2024) 0.00
- Parameter-efficient Transfer Learning Of Pre-trained Transformer Models For Speaker Verification Using Adapters (2022) 0.00
- Exploring Efficient-tuning Methods In Self-supervised Speech Models (2022) 10.07
- Integrated Parameter-efficient Tuning For General-purpose Audio Models (2022) 0.00
- Elp-adapters: Parameter Efficient Adapter Tuning For Various Speech Processing Tasks (2024) 7.81
- Leveraging Parameter-efficient Transfer Learning For Multi-lingual Text-to-speech Adaptation (2024) 0.00
- Efficient Adapter Transfer Of Self-supervised Speech Models For Automatic Speech Recognition (2022) 12.68
- CHAPTER: Exploiting Convolutional Neural Network Adapters For Self-supervised Speech Models (2022) 7.50