CA-SSLR: Condition-aware Self-supervised Learning Representation For Generalized Speech Processing
2024 · Yen-Ju Lu, Jing Liu, Thomas Thebaud, et al.
Abstract
We introduce Condition-Aware Self-Supervised Learning Representation (CA-SSLR), a generalist conditioning model broadly applicable to various speech-processing tasks. Compared to standard fine-tuning methods that optimize for downstream models, CA-SSLR integrates language and speaker embeddings from earlier layers, making the SSL model aware of the current language and speaker context. This approach reduces the reliance on input audio features while preserving the integrity of the base SSLR. CA-SSLR improves the model's capabilities and demonstrates its generality on unseen tasks with minimal task-specific tuning. Our method employs linear modulation to dynamically adjust internal representations, enabling fine-grained adaptability without significantly altering the original model behavior. Experiments show that CA-SSLR reduces the number of trainable parameters, mitigates overfitting, and excels in under-resourced and unseen tasks. Specifically, CA-SSLR achieves a 10% relative reducti
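The linear modulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a feature-wise scale and shift (FiLM-style) computed from a single condition embedding, such as a language or speaker vector. The function names, the `1 + gamma` parameterization, and the toy dimensions are assumptions made for illustration, not details taken from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def linear_modulation(hidden: Matrix, cond: Vector,
                      w_gamma: Matrix, w_beta: Matrix) -> Matrix:
    """Apply a feature-wise scale and shift to every frame of `hidden`.

    hidden:  frames x dim matrix of layer activations (hypothetical shape)
    cond:    condition embedding, e.g. a language or speaker vector
    w_gamma: dim x cond_dim projection producing the scale offset
    w_beta:  dim x cond_dim projection producing the shift
    """
    # Assumption: the scale is 1 + projection, so zero-initialised
    # projections leave the base representation unchanged.
    gamma = [1.0 + g for g in matvec(w_gamma, cond)]
    beta = matvec(w_beta, cond)
    return [[g * h + b for h, g, b in zip(frame, gamma, beta)]
            for frame in hidden]


# Toy example: two frames, two features, three-dimensional condition.
hidden = [[1.0, 2.0], [3.0, 4.0]]
cond = [1.0, 0.0, -1.0]

zero = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
unchanged = linear_modulation(hidden, cond, zero, zero)

w_gamma = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]
w_beta = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
modulated = linear_modulation(hidden, cond, w_gamma, w_beta)
```

With zero projections the output equals the input, which is one simple way to adapt internal representations without disturbing the base model at the start of training.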
Related papers
- An Adapter Based Pre-training For Efficient And Scalable Self-supervised Speech Representation Learning (2021)
- Automatic Pronunciation Assessment Using Self-supervised Speech Representation Learning (2022)
- Downstream Task Agnostic Speech Enhancement With Self-supervised Representation Loss (2023)
- Multi-variant Consistency Based Self-supervised Learning For Robust Automatic Speech Recognition (2021)
- Fusion Of Discrete Representations And Self-augmented Representations For Multilingual Automatic Speech Recognition (2024)
- CA-MHFA: A Context-aware Multi-head Factorized Attentive Pooling For SSL-based Speaker Verification (2024)
- UniSpeech-SAT: Universal Speech Representation Learning With Speaker Aware Pre-training (2021)
- Combining Spectral And Self-supervised Features For Low Resource Speech Recognition And Translation (2022)