Cycle-contrast For Self-supervised Video Representation Learning
2020 · Quan Kong, Wenpeng Wei, Ziwei Deng, et al.
Abstract
We present Cycle-Contrastive Learning (CCL), a novel self-supervised method for learning video representations. Building on the natural inclusion relation between a video and its frames (each frame belongs to a video, and each video contains its frames), CCL finds correspondences across frames and videos while contrasting representations within each of the two domains. This differs from recent approaches that learn correspondences only across frames or clips. In our method, frame and video representations are learned by a single network based on the R3D architecture, with a shared non-linear transformation that embeds both frame and video features before the cycle-contrastive loss is applied. We demonstrate that the video representation learned by CCL transfers well to downstream video understanding tasks, outperforming previous methods in nearest-neighbour retrieval and action recognition on UCF101, HMDB51 and MMAct.
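The abstract does not spell out the loss, so the following is only a minimal sketch of the frame-to-video inclusion idea, not the paper's cycle-contrastive loss. It assumes an InfoNCE-style objective in which a frame's positive is its own video and the other videos are negatives. The names `project` and `frame_to_video_nce`, the two-layer ReLU head, the temperature, and the use of mean-pooled frame features as a stand-in for the video feature are all hypothetical choices made for illustration.

```python
import math
import random


def project(x, w1, w2):
    """Shared non-linear head (hypothetical 2-layer MLP), then L2 normalisation.

    w1 and w2 are lists of weight rows, one row per output unit.
    """
    h = [max(0.0, sum(xi * w for xi, w in zip(x, row))) for row in w1]
    z = [sum(hi * w for hi, w in zip(h, row)) for row in w2]
    norm = math.sqrt(sum(v * v for v in z)) or 1.0  # avoid division by zero
    return [v / norm for v in z]


def frame_to_video_nce(frames, videos, w1, w2, temperature=0.1):
    """InfoNCE-style loss: each frame should match the video it belongs to.

    frames[i][t] is the feature of frame t of video i;
    videos[i] is the clip-level feature of video i.
    Both pass through the same projection head.
    """
    zv = [project(v, w1, w2) for v in videos]
    losses = []
    for i, clip in enumerate(frames):
        for f in clip:
            zf = project(f, w1, w2)
            logits = [sum(a * b for a, b in zip(zf, z)) / temperature for z in zv]
            m = max(logits)  # subtract the max for a stable log-sum-exp
            log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
            losses.append(log_denom - logits[i])  # -log softmax at the true video
    return sum(losses) / len(losses)


# Toy data: 4 videos, 3 frames each, 6-dimensional backbone features.
rng = random.Random(0)
d, hidden, e = 6, 8, 4
frames = [[[rng.gauss(0, 1) for _ in range(d)] for _ in range(3)] for _ in range(4)]
# Assumption: mean-pooled frame features stand in for the video-level feature.
videos = [[sum(f[k] for f in clip) / len(clip) for k in range(d)] for clip in frames]
w1 = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(hidden)]
w2 = [[rng.gauss(0, 1) for _ in range(hidden)] for _ in range(e)]
loss = frame_to_video_nce(frames, videos, w1, w2)
```

In the method described above, both kinds of feature would come from the same R3D-based network rather than from pooling. The sketch keeps only the part the abstract states explicitly: a single shared transformation embeds frame and video features before they are contrasted.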
Related papers
- TCLR: Temporal Contrastive Learning For Video Representation (2021) 15.78
- Self-supervised Video Representation Learning With Meta-contrastive Network (2021) 11.85
- Self-supervised Video Representation Learning With Cross-stream Prototypical Contrasting (2021) 8.82
- Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework (2020) 18.58
- Nearest-neighbor Inter-intra Contrastive Learning From Unlabeled Videos (2023) 0.00
- Robust Cross-modal Representation Learning With Progressive Self-distillation (2022) 12.33
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations (2021) 15.59
- Contrastive Video-language Learning With Fine-grained Frame Sampling (2022) 6.77