Simultaneous Or Sequential Training? How Speech Representations Cooperate In A Multi-task Self-supervised Learning System
2023 · Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, et al.
Abstract
Speech representation learning with self-supervised algorithms has resulted in notable performance boosts in many downstream tasks. Recent work combined self-supervised learning (SSL) and visually grounded speech (VGS) processing mechanisms for representation learning. The joint training with SSL and VGS mechanisms provides the opportunity to utilize both unlabeled speech and speech-related visual information based on data availability. This has been shown to enhance the quality of learned representations, especially at encoding semantic- and lexical-level knowledge. In this work, we further study the joint optimization of wav2vec 2.0-based SSL and transformer-based VGS as a multi-task learning system. We explore a set of training scenarios to understand how speech representations are shared or transferred between the two tasks, and which training strategy is optimal for cross-modal semantic retrieval and phoneme discrimination performance. As a result, we find that sequential training w
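The difference between simultaneous and sequential multi-task training can be sketched as a per-step weighting of the two task gradients. The example below is a minimal illustration under loud assumptions, not the authors' implementation: `ssl_grad` and `vgs_grad` are hypothetical toy gradients of quadratic losses over a single shared parameter, standing in for the wav2vec 2.0 SSL loss and the VGS loss, and the schedule names and equal 0.5/0.5 weighting are invented for illustration.

```python
from typing import List, Tuple


def ssl_grad(w: float) -> float:
    """Toy stand-in (assumption) for the SSL loss gradient: d/dw (w - 1)^2."""
    return 2.0 * (w - 1.0)


def vgs_grad(w: float) -> float:
    """Toy stand-in (assumption) for the VGS loss gradient: d/dw (w + 1)^2."""
    return 2.0 * (w + 1.0)


def build_schedule(mode: str, steps: int) -> List[Tuple[float, float]]:
    """Return one (ssl_weight, vgs_weight) pair per training step."""
    half = steps // 2
    if mode == "simultaneous":
        # Both losses contribute to every update of the shared parameter.
        return [(0.5, 0.5)] * steps
    if mode == "ssl_then_vgs":
        # Sequential: optimize SSL alone first, then VGS alone.
        return [(1.0, 0.0)] * half + [(0.0, 1.0)] * (steps - half)
    if mode == "vgs_then_ssl":
        # Sequential, reverse order.
        return [(0.0, 1.0)] * half + [(1.0, 0.0)] * (steps - half)
    raise ValueError(f"unknown mode: {mode}")


def train(mode: str, steps: int = 200, lr: float = 0.1, w0: float = 0.5) -> float:
    """Run plain gradient descent on the shared parameter under a schedule."""
    w = w0
    for ssl_weight, vgs_weight in build_schedule(mode, steps):
        w -= lr * (ssl_weight * ssl_grad(w) + vgs_weight * vgs_grad(w))
    return w


if __name__ == "__main__":
    for mode in ("simultaneous", "ssl_then_vgs", "vgs_then_ssl"):
        print(mode, round(train(mode), 4))
```

In this toy setting the simultaneous schedule settles between the two task optima, while each sequential schedule ends at the optimum of whichever task was trained last. That only demonstrates the scheduling mechanics; it says nothing about which strategy works better for real speech models.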
Related papers
- Efficient Infusion Of Self-supervised Representations In Automatic Speech Recognition (2024) · 0.00
- Learning Problem-agnostic Speech Representations From Multiple Self-supervised Tasks (2019) · 15.54
- Fusion Of Discrete Representations And Self-augmented Representations For Multilingual Automatic Speech Recognition (2024) · 2.26
- An Adapter Based Pre-training For Efficient And Scalable Self-supervised Speech Representation Learning (2021) · 8.35
- Exploring Effective Fusion Algorithms For Speech Based Self-supervised Learning Models (2022) · 0.00
- Speech Representation Analysis Based On Inter- And Intra-model Similarities (2024) · 2.26
- Multi-task Voice Activated Framework Using Self-supervised Learning (2021) · 6.34
- UniSpeech-SAT: Universal Speech Representation Learning With Speaker Aware Pre-training (2021) · 0.00