GOCA: Guided Online Cluster Assignment For Self-supervised Video Representation Learning
2022 · Huseyin Coskun, Alireza Zareian, Joshua L. Moore, et al.
Abstract
Clustering is a ubiquitous tool in unsupervised learning. Most existing self-supervised representation learning methods cluster samples based on visually dominant features. While this works well for image-based self-supervision, it often fails for videos, which require understanding motion rather than focusing on background. Using optical flow as complementary information to RGB can alleviate this problem. However, we observe that a naive combination of the two views does not provide meaningful gains. In this paper, we propose a principled way to combine the two views. Specifically, we propose a novel clustering strategy where we use the initial cluster assignment of each view as a prior to guide the final cluster assignment of the other view. This idea will enforce similar cluster structures for both views, and the formed clusters will be semantically abstract and robust to noisy inputs coming from each individual view. Additionally, we propose a novel regularization strate
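The guidance idea in the abstract (each view's initial cluster assignment acting as a prior on the other view's final assignment) can be illustrated with a toy sketch. The abstract does not say how the prior is applied, so everything below is an assumption made for illustration: the multiplicative reweighting rule, the `strength` parameter, the function names, and the scores are all hypothetical and are not the paper's actual method.

```python
import math


def softmax(scores):
    """Turn a list of raw cluster scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guided_assignment(own_scores, other_scores, strength=1.0):
    """Hypothetical guidance rule (an assumption, not from the paper):
    reweight this view's initial assignment by the other view's initial
    assignment raised to `strength`, then renormalize."""
    initial = softmax(own_scores)   # initial assignment of this view
    prior = softmax(other_scores)   # initial assignment of the other view
    weighted = [p * (q ** strength) for p, q in zip(initial, prior)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Toy clip with 3 clusters. The RGB view is torn between clusters 0 and 1
# (for example, dominated by background), while the optical-flow view
# clearly favors cluster 1. All numbers are made up.
rgb_scores = [2.0, 1.9, 0.0]
flow_scores = [0.0, 3.0, 0.0]

rgb_initial = softmax(rgb_scores)
rgb_final = guided_assignment(rgb_scores, flow_scores)
flow_final = guided_assignment(flow_scores, rgb_scores)

print("RGB initial:", [round(p, 3) for p in rgb_initial])
print("RGB final:  ", [round(p, 3) for p in rgb_final])
print("Flow final: ", [round(p, 3) for p in flow_final])
```

In this toy setting the flow prior shifts the RGB view's most likely cluster from 0 to 1, which is the kind of cross-view agreement the abstract describes. With `strength=1.0` the rule is symmetric, so both views end up with identical final assignments; a smaller value would let each view keep more of its own initial assignment.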
Related papers
- Multimodal Clustering Networks For Self-supervised Learning From Unlabeled Videos (2021) 13.28
- Self-supervised Video Representation Learning With Cross-stream Prototypical Contrasting (2021) 8.82
- Representation Learning Via Consistent Assignment Of Views Over Random Partitions (2023) 0.00
- Graph-collaborated Auto-encoder Hashing For Multi-view Binary Clustering (2023) 14.31
- Joint Representation Learning And Novel Category Discovery On Single- And Multi-modal Data (2021) 13.11
- Cycle-contrast For Self-supervised Video Representation Learning (2020) 0.00
- Unsupervised High-level Feature Learning By Ensemble Projection For Semi-supervised Image Classification And Image Clustering (2016) 0.00
- Robust Character Labeling In Movie Videos: Data Resources And Self-supervised Feature Adaptation (2020) 6.34