SE/BN Adapter: Parametric Efficient Domain Adaptation For Speaker Recognition
2024 · Tianhao Wang, Lantian Li, Dong Wang
Abstract
Deploying a well-optimized pre-trained speaker recognition model in a new domain often leads to a significant decline in performance. While fine-tuning is a commonly employed solution, it demands ample adaptation data and is parameter-inefficient, which makes it impractical for real-world applications where little data is available for model adaptation. Drawing inspiration from the success of adapters in self-supervised pre-trained models, this paper introduces an SE/BN adapter to address this challenge. The core speaker encoder is frozen, and the adapter adjusts the feature maps' weights and activation distributions through trainable squeeze-and-excitation (SE) blocks and batch normalization (BN) layers. Our experiments, conducted using VoxCeleb for pre-training and 4 genres from CN-Celeb for adaptation, demonstrate that the SE/BN adapter offers significant performance improvement over the baseline and competes with the vanilla fine-tu
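The idea in the abstract (freeze the encoder, train only SE and BN parameters) can be illustrated with a small standard-library sketch. It is not the authors' implementation: the stand-in `frozen_layer`, the SE-then-BN ordering, the reduction ratio `r`, the toy sizes, and the use of per-utterance statistics in place of batch statistics are all assumptions made for illustration.

```python
import math
import random

def frozen_layer(x, weight):
    """Hypothetical stand-in for one frozen encoder layer: 1x1 channel mixing on a [C][T] map."""
    T = len(x[0])
    return [[sum(w * x[c][t] for c, w in enumerate(row)) for t in range(T)]
            for row in weight]

def se_scale(x, w1, w2):
    """Squeeze-and-excitation: reweight each channel of a [C][T] feature map."""
    squeezed = [sum(ch) / len(ch) for ch in x]                  # squeeze: mean over time
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]                                    # bottleneck FC + ReLU
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]                                     # FC + sigmoid, one gate per channel
    return [[g * v for v in ch] for g, ch in zip(gates, x)]

def bn_affine(x, gamma, beta, eps=1e-5):
    """Per-channel normalization with trainable scale (gamma) and shift (beta).
    Simplification: statistics come from this single feature map, not a batch."""
    out = []
    for ch, g, b in zip(x, gamma, beta):
        mean = sum(ch) / len(ch)
        var = sum((v - mean) ** 2 for v in ch) / len(ch)
        out.append([g * (v - mean) / math.sqrt(var + eps) + b for v in ch])
    return out

def adapted_forward(x, frozen_w, adapter):
    """Frozen layer followed by the trainable SE and BN parts (ordering is an assumption)."""
    h = frozen_layer(x, frozen_w)
    h = se_scale(h, adapter["w1"], adapter["w2"])
    return bn_affine(h, adapter["gamma"], adapter["beta"])

random.seed(0)
C, T, r = 8, 5, 4  # channels, frames, SE reduction ratio (toy values)
frozen_w = [[random.gauss(0, 1) for _ in range(C)] for _ in range(C)]  # never updated
adapter = {  # the only parameters that adaptation would update
    "w1": [[random.gauss(0, 0.1) for _ in range(C)] for _ in range(C // r)],
    "w2": [[random.gauss(0, 0.1) for _ in range(C // r)] for _ in range(C)],
    "gamma": [1.0] * C,
    "beta": [0.0] * C,
}
x = [[random.gauss(0, 1) for _ in range(T)] for _ in range(C)]
y = adapted_forward(x, frozen_w, adapter)

n_frozen = C * C
n_trainable = (sum(len(row) for row in adapter["w1"])
               + sum(len(row) for row in adapter["w2"]) + 2 * C)
print(f"frozen: {n_frozen}, trainable adapter: {n_trainable}")
```

In this toy setup the frozen layer is tiny, so the parameter counts say nothing about the savings reported in the paper; the point is only that gradient updates would touch `adapter` and leave `frozen_w` unchanged.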
Related papers
- Efficient Adapter Tuning Of Pre-trained Speech Models For Automatic Speaker Verification (2024) · 0.00
- Multi-domain Adaptation By Self-supervised Learning For Speaker Verification (2023) · 0.00
- Efficient Black-box Speaker Verification Model Adaptation With Reprogramming And Backend Learning (2023) · 0.00
- VAE-based Domain Adaptation For Speaker Verification (2019) · 7.50
- DEAAN: Disentangled Embedding And Adversarial Adaptation Network For Robust Speaker Representation Learning (2020) · 9.59
- Adapting End-to-end Neural Speaker Verification To New Languages And Recording Conditions With Adversarial Training (2018) · 9.59
- ELP-adapters: Parameter Efficient Adapter Tuning For Various Speech Processing Tasks (2024) · 7.81
- Adapter-based Extension Of Multi-speaker Text-to-speech Model For New Speakers (2022) · 6.77