Self-supervised Video Representation Learning With Meta-contrastive Network
2021 · Yuanze Lin, Xun Guo, Yan Lu
Abstract
Self-supervised learning has been successfully applied to pre-training video representations, with the aim of efficient adaptation from the pre-training domain to downstream tasks. Existing approaches merely leverage a contrastive loss to learn instance-level discrimination. However, the lack of category information leads to a hard-positive problem that constrains the generalization ability of such methods. We find that the multi-task process of meta learning can provide a solution to this problem. In this paper, we propose a Meta-Contrastive Network (MCN), which combines contrastive learning and meta learning to enhance the learning ability of existing self-supervised approaches. Our method contains two training stages based on model-agnostic meta learning (MAML), each of which consists of a contrastive branch and a meta branch. Extensive evaluations demonstrate the effectiveness of our method. For two downstream tasks, i.e., video action recognition and video retrieval, MCN outperfor
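
To make the instance-level discrimination mentioned in the abstract concrete, the following is a minimal NumPy sketch of a generic InfoNCE-style contrastive loss. It is not taken from the paper: the function name `info_nce_loss`, the cosine-similarity scoring, and the temperature value of 0.07 are illustrative assumptions. Each clip's only positive is another view of the same clip, and every other clip in the batch is pushed away as a negative, even when it shares a category with the anchor. That is one way to read the hard-positive problem the abstract describes.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.07):
    """Generic InfoNCE loss over a batch (illustrative, not the paper's code).

    anchors, positives: arrays of shape (N, D). Row i of each array is
    assumed to hold embeddings of two views of the same video clip.
    All other rows in the batch act as negatives for row i.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (N, N) similarity matrix; the diagonal holds the positive pairs.
    logits = (a @ p.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the diagonal entry as the correct "class".
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clips = rng.normal(size=(8, 16))                    # hypothetical clip embeddings
    views = clips + 0.01 * rng.normal(size=(8, 16))     # slightly perturbed second views
    print("matched views:   ", info_nce_loss(clips, views))
    print("mismatched views:", info_nce_loss(clips, np.roll(views, 1, axis=0)))
```

The loss is small when each anchor is most similar to its own second view and grows when the pairing is broken. Nothing in it uses category labels, which is why two different clips of the same action would still be treated as negatives.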
Related papers
- Cycle-contrast For Self-supervised Video Representation Learning (2020) · 0.00
- Revisiting Contrastive Methods For Unsupervised Learning Of Visual Representations (2021) · 3.91
- Self-supervised Video Representation Learning With Cross-stream Prototypical Contrasting (2021) · 8.82
- Multimodal Clustering Networks For Self-supervised Learning From Unlabeled Videos (2021) · 13.28
- TCLR: Temporal Contrastive Learning For Video Representation (2021) · 15.78
- Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework (2020) · 18.58
- TC-MGC: Text-conditioned Multi-grained Contrastive Learning For Text-video Retrieval (2025) · 6.93
- Multimodal Contrastive Training For Visual Representation Learning (2021) · 16.32