Continual Contrastive Spoken Language Understanding
2023 · Umberto Cappellazzo, Enrico Fini, Muqiao Yang, et al.
Abstract
Recently, neural networks have shown impressive progress across diverse fields, with speech processing being no exception. However, recent breakthroughs in this area require extensive offline training using large datasets and tremendous computing resources. Unfortunately, these models struggle to retain their previously acquired knowledge when learning new tasks continually, and retraining from scratch is almost always impractical. In this paper, we investigate the problem of learning sequence-to-sequence models for spoken language understanding in a class-incremental learning (CIL) setting and we propose COCONUT, a CIL method that relies on the combination of experience replay and contrastive learning. Through a modified version of the standard supervised contrastive loss applied only to the rehearsal samples, COCONUT preserves the learned representations by pulling closer samples from the same class and pushing away the others. Moreover, we leverage a multimodal contrastive loss that
Related papers
- Sequence-level Knowledge Distillation For Class-incremental End-to-end Spoken Language Understanding (2023)
- Sequential Contrastive Audio-visual Learning (2024)
- Learning Speech Representation From Contrastive Token-acoustic Pretraining (2023)
- HC\(^2\)L: Hybrid And Cooperative Contrastive Learning For Cross-lingual Spoken Language Understanding (2024)
- CSTNet: Contrastive Speech Translation Network For Self-supervised Speech Representation Learning (2020)
- GL-CLEF: A Global-local Contrastive Learning Framework For Cross-lingual Spoken Language Understanding (2022)
- Contrastive Learning For Improving ASR Robustness In Spoken Language Understanding (2022)
- Towards Robust Few-shot Class Incremental Learning In Audio Classification Using Contrastive Representation (2024)