Robust Character Labeling In Movie Videos: Data Resources And Self-supervised Feature Adaptation
2020 · Krishna Somandepalli, Rajat Hebbar, Shrikanth Narayanan
Abstract
Robust face clustering is a vital step in enabling computational understanding of visual character portrayal in media. Face clustering for long-form content is challenging because of variations in appearance and the lack of supporting large-scale labeled data. Our work in this paper focuses on two key aspects of this problem: the lack of domain-specific training or benchmark datasets, and adapting face embeddings learned on web images to long-form content, specifically movies. First, we present a dataset of over 169,000 face tracks curated from 240 Hollywood movies with weak labels on whether a pair of face tracks belongs to the same or a different character. We propose an offline algorithm based on nearest-neighbor search in the embedding space to mine hard examples from these tracks. We then investigate triplet-loss and multiview correlation-based methods for adapting face embeddings to hard examples. Our experimental results highlight the usefulness of weakly labeled data for domain-spec
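The hard-example mining and triplet-loss steps mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: it assumes cosine distance on L2-normalized track embeddings, a brute-force neighbor search, and weak labels given as index pairs. The names `mine_hard_triplets` and `triplet_loss`, and the `margin` value, are hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row of x to unit length."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def mine_hard_triplets(emb, same_pairs, diff_pairs):
    """Return (anchor, positive, negative) index triplets.

    emb: (n, d) array with one embedding per face track.
    same_pairs / diff_pairs: weak labels as (i, j) index pairs
    (an assumed input format, not taken from the paper).
    """
    emb = l2_normalize(emb)
    dist = 1.0 - emb @ emb.T  # pairwise cosine distance

    pos, neg = {}, {}
    for i, j in same_pairs:
        pos.setdefault(i, set()).add(j)
        pos.setdefault(j, set()).add(i)
    for i, j in diff_pairs:
        neg.setdefault(i, set()).add(j)
        neg.setdefault(j, set()).add(i)

    triplets = []
    for a in sorted(pos.keys() & neg.keys()):
        p = max(pos[a], key=lambda j: dist[a, j])  # farthest same-character track
        n = min(neg[a], key=lambda j: dist[a, j])  # nearest different-character track
        # Keep only triplets where the negative is closer than the positive.
        if dist[a, n] < dist[a, p]:
            triplets.append((a, p, n))
    return triplets


def triplet_loss(emb, triplets, margin=0.2):
    """Mean hinge loss max(0, d(a, p) - d(a, n) + margin) over the triplets."""
    if not triplets:
        return 0.0
    emb = l2_normalize(emb)
    a, p, n = (np.array(idx) for idx in zip(*triplets))
    d_ap = 1.0 - np.sum(emb[a] * emb[p], axis=1)
    d_an = 1.0 - np.sum(emb[a] * emb[n], axis=1)
    return float(np.mean(np.maximum(0.0, d_ap - d_an + margin)))


# Toy data: track 1 is the same character as track 0 but looks different;
# track 2 is a different character that looks similar to track 0.
toy_emb = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
hard = mine_hard_triplets(toy_emb, same_pairs=[(0, 1)], diff_pairs=[(0, 2)])
loss = triplet_loss(toy_emb, hard)
```

In this sketch a triplet counts as hard only when the embedding space currently ranks a different-character track ahead of a same-character one, which is the case an adaptation step would need to correct. A real pipeline would replace the dense distance matrix with an approximate nearest-neighbor index for a collection of this size.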
Related papers
- Dual-triplet Metric Learning For Unsupervised Domain Adaptation In Video-based Face Recognition (2020)
- Multimodal Clustering Networks For Self-supervised Learning From Unlabeled Videos (2021)
- GOCA: Guided Online Cluster Assignment For Self-supervised Video Representation Learning (2022)
- Robust Cross-modal Representation Learning With Progressive Self-distillation (2022)
- Learnable Pins: Cross-modal Embeddings For Person Identity (2018)
- Learning Local Descriptors By Optimizing The Keypoint-correspondence Criterion: Applications To Face Matching, Learning From Unlabeled Videos And 3D-shape Retrieval (2016)
- Learning Robust Visual-semantic Embeddings (2017)
- Domain Adaptation In Multi-view Embedding For Cross-modal Video Retrieval (2021)