Nearest-neighbor Inter-intra Contrastive Learning From Unlabeled Videos
2023 · David Fan, Deyu Yang, Xinyu Li, et al.
Abstract
Contrastive learning has recently narrowed the gap between self-supervised and supervised methods in the image and video domains. State-of-the-art video contrastive learning methods such as CVRL and \(\rho\)-MoCo spatiotemporally augment two clips from the same video as positives. By only sampling positive clips locally from a single video, these methods neglect other semantically related videos that can also be useful. To address this limitation, we leverage nearest-neighbor videos from the global space as additional positive pairs, thus improving positive key diversity and introducing a more relaxed notion of similarity that extends beyond video and even class boundaries. Our method, Inter-Intra Video Contrastive Learning (IIVCL), improves performance on a range of video tasks.
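The idea in the abstract, pairing a clip with both a second clip from the same video (intra) and a nearest-neighbor embedding from other videos (inter), can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the loss, the support set, the temperature, or how the two terms are weighted, so the InfoNCE form, the `support` list, `temperature=0.1`, the equal weighting, and every function name below are assumptions made for illustration only.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; fall back to 1.0 for a zero vector.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nearest_neighbor(query, support):
    # Index of the support embedding with the highest cosine similarity.
    q = l2_normalize(query)
    sims = [dot(q, l2_normalize(s)) for s in support]
    return max(range(len(support)), key=lambda i: sims[i])

def info_nce(query, positive, negatives, temperature=0.1):
    # InfoNCE: negative log-softmax of the positive logit (assumed loss form).
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

def inter_intra_loss(clip_a, clip_b, support, negatives, temperature=0.1):
    # Intra term: the positive is a second clip from the same video.
    intra = info_nce(clip_a, clip_b, negatives, temperature)
    # Inter term: the positive is the nearest neighbor of clip_b in a
    # hypothetical support set of embeddings drawn from other videos.
    nn_idx = nearest_neighbor(clip_b, support)
    inter = info_nce(clip_a, support[nn_idx], negatives, temperature)
    # Equal weighting of the two terms is an assumption.
    return intra + inter

# Toy 2-D embeddings, purely illustrative.
clip_a = [1.0, 0.1]
clip_b = [0.9, 0.2]
support = [[0.0, 1.0], [1.0, 0.3], [-1.0, 0.0]]
negatives = [[-1.0, 0.5], [0.0, -1.0]]
loss = inter_intra_loss(clip_a, clip_b, support, negatives)
```

In this toy setup the nearest neighbor of `clip_b` is the second support entry, so the inter term pulls `clip_a` toward an embedding from a different video. That is the "relaxed notion of similarity" the abstract describes.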
Related papers
- Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework (2020)
- With A Little Help From My Friends: Nearest-neighbor Contrastive Learning Of Visual Representations (2021)
- TCLR: Temporal Contrastive Learning For Video Representation (2021)
- Cycle-contrast For Self-supervised Video Representation Learning (2020)
- CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations (2021)
- Normalized Contrastive Learning For Text-video Retrieval (2022)
- Video Corpus Moment Retrieval With Contrastive Learning (2021)
- X-CLIP: End-to-end Multi-grained Contrastive Learning For Video-text Retrieval (2022)