Aligned Contrastive Predictive Coding
2021 · Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, et al.
Abstract
We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations. Rather than producing individual predictions for each of the future representations, the model emits a sequence of predictions shorter than that of the upcoming representations to which they will be aligned. In this way, the prediction network solves a simpler task of predicting the next symbols, but not their exact timing, while the encoding network is trained to produce piece-wise constant latent codes. We evaluate the model on a speech coding task and demonstrate that the proposed Aligned Contrastive Predictive Coding (ACPC) leads to higher linear phone prediction accuracy and lower ABX error rates, while being slightly faster to train due to the reduced number of prediction heads.
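The alignment idea in the abstract can be illustrated with a small dynamic program. This is only a minimal sketch under assumptions the abstract does not state: each of the M upcoming representations is assigned to exactly one of K predictions (K <= M), assignments are monotonic, every prediction is used at least once, and alignment quality is the sum of similarity scores. The function name `align_predictions` and the toy score matrix are hypothetical.

```python
import numpy as np

def align_predictions(scores):
    """Monotonically align K predictions to M upcoming frames (K <= M).

    scores[k, m] is an assumed similarity between prediction k and frame m.
    Returns the best total score and, for each frame, the index of the
    prediction it was aligned to.
    """
    K, M = scores.shape
    if K > M:
        raise ValueError("need at least as many frames as predictions")

    best = np.full((K, M), -np.inf)      # best[k, m]: best score ending at (k, m)
    back = np.zeros((K, M), dtype=int)   # 1 if we advanced to prediction k at frame m
    best[0, 0] = scores[0, 0]            # the first frame must use the first prediction

    for m in range(1, M):
        for k in range(min(K, m + 1)):   # prediction k cannot be reached before frame k
            stay = best[k, m - 1]
            advance = best[k - 1, m - 1] if k > 0 else -np.inf
            if advance > stay:
                best[k, m] = advance + scores[k, m]
                back[k, m] = 1
            else:
                best[k, m] = stay + scores[k, m]

    # Backtrack from the last prediction at the last frame.
    assignment = np.zeros(M, dtype=int)
    k = K - 1
    for m in range(M - 1, -1, -1):
        assignment[m] = k
        k -= back[k, m]
    return float(best[K - 1, M - 1]), assignment

# Toy example: 2 predictions, 4 upcoming frames.
toy_scores = np.array([[1.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 1.0]])
total, assignment = align_predictions(toy_scores)
print(total, assignment.tolist())
```

In this toy case the first two frames align to the first prediction and the last two to the second, which is the piece-wise constant pattern the abstract describes: the predictor needs to get the order of symbols right, not their exact timing.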
Related papers
- Unsupervised Speech Segmentation And Variable Rate Representation Learning Using Segmental Contrastive Predictive Coding (2021)
- Guided Contrastive Self-supervised Pre-training For Automatic Speech Recognition (2022)
- Contrastive Prediction Strategies For Unsupervised Segmentation And Categorization Of Phonemes And Words (2021)
- Segmental Contrastive Predictive Coding For Unsupervised Word Segmentation (2021)
- Contrastive Separative Coding For Self-supervised Representation Learning (2021)
- Self-supervised Representation Learning With Relative Predictive Coding (2021)
- Scala: Supervised Contrastive Learning For End-to-end Speech Recognition (2021)
- Non-autoregressive Predictive Coding For Learning Speech Representations From Local Dependencies (2020)