Word-level Embeddings For Cross-task Transfer Learning In Speech Processing
2019 · Pierre Beckmann, Mikolaj Kegler, Milos Cernak
Abstract
Recent breakthroughs in deep learning often rely on representation learning and knowledge transfer. In recent years, unsupervised and self-supervised techniques for learning speech representations have been developed to foster automatic speech recognition. To date, most of these approaches are task-specific and designed for within-task transfer learning between different datasets or setups of a particular task. By contrast, learning task-independent representations of speech and cross-task applications of transfer learning remain less common. Here, we introduce an encoder capturing word-level representations of speech for cross-task transfer learning. We demonstrate the application of the pre-trained encoder in four distinct speech and audio processing tasks: (i) speech enhancement, (ii) language identification, (iii) speech, noise, and music classification, and (iv) speaker identification. In each task, we compare the performance of our cross-task transfer learning approach to task-specific b
Related papers
- Learning Problem-agnostic Speech Representations From Multiple Self-supervised Tasks (2019) 15.54
- On The Transferability Of Whisper-based Representations For "in-the-wild" Cross-task Downstream Speech Applications (2023) 0.00
- Towards Learning A Universal Non-semantic Representation Of Speech (2020) 14.43
- Progressive Neural Networks For Transfer Learning In Emotion Recognition (2017) 14.19
- Self-supervised Rewiring Of Pre-trained Speech Encoders: Towards Faster Fine-tuning With Less Labels In Speech Processing (2022) 3.58
- Pretrained Semantic Speech Embeddings For End-to-end Spoken Language Understanding Via Cross-modal Teacher-student Learning (2020) 9.92
- Supervised Acoustic Embeddings And Their Transferability Across Languages (2023) 0.00
- Multi-task Voice Activated Framework Using Self-supervised Learning (2021) 6.34