DiDeMo
Emerging
2 papers using it
445 HF downloads
11 HF likes
First seen: 2021
About
DiDeMo contains 10K long-form videos from Flickr. Each video is annotated with roughly 4 short sentences in temporal order. Following existing work, we concatenate these short sentences and evaluate ‘paragraph-to-video’ retrieval on this benchmark. We adopt the official split: Train: 8,395 videos, 8,395 captions (c