
DiDeMo

Emerging
2 papers using it
445 HF downloads
11 HF likes
2021 first seen

About

DiDeMo contains 10K long-form videos from Flickr. Each video is annotated with roughly four short sentences in temporal order. Following existing work, we concatenate those short sentences and evaluate ‘paragraph-to-video’ retrieval on this benchmark. We adopt the official split: Train: 8,395 videos, 8,395 captions (c
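The concatenation step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' preprocessing code: the function name `build_paragraph_queries` and the record fields `video`, `start`, and `sentence` are hypothetical and do not reflect the actual DiDeMo annotation schema.

```python
from collections import defaultdict


def build_paragraph_queries(annotations):
    """Join each video's sentences, in temporal order, into one paragraph.

    `annotations` is an iterable of dicts with the hypothetical keys
    "video" (an identifier), "start" (a sortable time index), and
    "sentence" (the annotated text).
    """
    by_video = defaultdict(list)
    for ann in annotations:
        by_video[ann["video"]].append((ann["start"], ann["sentence"]))

    paragraphs = {}
    for video, items in by_video.items():
        # Sort by start time only; sentences with equal start keep input order.
        items.sort(key=lambda item: item[0])
        paragraphs[video] = " ".join(text.strip() for _, text in items)
    return paragraphs
```

Each resulting paragraph then serves as a single text query for its video, which is what makes the task paragraph-to-video retrieval rather than sentence-to-moment localization.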

Papers using DiDeMo (2)
