Learning Problem-agnostic Speech Representations From Multiple Self-supervised Tasks
2019 · Santiago Pascual, Mirco Ravanelli, Joan Serrà, et al.
Abstract
Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure. Some recent works, however, have shown that it is possible to derive useful speech representations by employing a self-supervised encoder-discriminator approach. This paper proposes an improved self-supervised method, where a single neural encoder is followed by multiple workers that jointly solve different self-supervised tasks. The needed consensus across different tasks naturally imposes meaningful constraints on the encoder, helping it discover general representations and minimizing the risk of learning superficial ones. Experiments show that the proposed approach can learn transferable, robust, and problem-agnostic features that carry relevant information from the speech signal, such as speaker identity, phonemes, and even higher-level feature
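The encoder-plus-workers idea described in the abstract can be illustrated with a small forward-pass sketch. This is not the authors' implementation: the two workers (`reconstruct_frame`, `frame_energy`), the layer sizes, the ReLU encoder, and the equal-weight averaging of losses are all hypothetical choices made only to show how several tasks share one representation.

```python
import random

def linear(weights, x):
    """Apply a dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def init(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

rng = random.Random(0)
FRAME, HIDDEN = 8, 4  # hypothetical sizes, chosen for illustration only

# One shared encoder...
encoder_w = init(HIDDEN, FRAME, rng)

# ...followed by several workers (hypothetical tasks, not taken from the paper).
workers = {
    "reconstruct_frame": init(FRAME, HIDDEN, rng),  # regress the input frame
    "frame_energy": init(1, HIDDEN, rng),           # regress mean squared amplitude
}

def targets_for(frame):
    """Self-supervised targets: both are derived from the input itself."""
    return {
        "reconstruct_frame": frame,
        "frame_energy": [sum(v * v for v in frame) / len(frame)],
    }

def multitask_loss(frame):
    """Encode once, let every worker predict its target, average the losses."""
    z = [max(0.0, h) for h in linear(encoder_w, frame)]  # shared representation
    targets = targets_for(frame)
    losses = {name: mse(linear(w, z), targets[name]) for name, w in workers.items()}
    total = sum(losses.values()) / len(losses)  # assumption: equal task weights
    return total, losses

frame = [rng.uniform(-1.0, 1.0) for _ in range(FRAME)]
total, per_worker = multitask_loss(frame)
print(per_worker, total)
```

Because every worker reads the same vector `z`, minimizing `total` by gradient descent would push the encoder toward features that serve all tasks at once, which is the "consensus" constraint the abstract refers to. The sketch stops at the forward pass; training would need gradients, for example from an autodiff library.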
Related papers
- Similarity Analysis Of Self-supervised Speech Representations (2020) 10.07
- Simultaneous Or Sequential Training? How Speech Representations Cooperate In A Multi-task Self-supervised Learning System (2023) 3.58
- Pretext Tasks Selection For Multitask Self-supervised Speech Representation Learning (2021) 8.60
- Universal Paralinguistic Speech Representations Using Self-supervised Conformers (2021) 10.48
- Word-level Embeddings For Cross-task Transfer Learning In Speech Processing (2019) 5.24
- Learning Speech Representations From Raw Audio By Joint Audiovisual Self-supervision (2020) 0.00
- Contrastive Separative Coding For Self-supervised Representation Learning (2021) 0.00
- General-purpose Speech Representation Learning Through A Self-supervised Multi-granularity Framework (2021) 0.00