DiDeMo

Emerging

2 papers using it

445 HF downloads

11 HF likes

First seen: 2021

About

DiDeMo contains 10K long-form videos from Flickr. Each video is annotated with about 4 short sentences in temporal order. Following existing work, we concatenate those short sentences and evaluate ‘paragraph-to-video’ retrieval on this benchmark. We adopt the official split: Train: 8,395 videos, 8,395 captions (c
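The sentence concatenation described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's official preprocessing: the record fields `video_id`, `start`, and `sentence` are hypothetical names chosen for the example, and the real annotation files may be laid out differently.

```python
from collections import defaultdict


def build_paragraphs(annotations):
    """Join each video's short sentences, in temporal order, into one paragraph.

    `annotations` is a list of dicts with hypothetical keys:
    "video_id", "start" (seconds), and "sentence".
    """
    by_video = defaultdict(list)
    for ann in annotations:
        by_video[ann["video_id"]].append(ann)

    paragraphs = {}
    for video_id, anns in by_video.items():
        # Order the sentences by when they occur in the video.
        ordered = sorted(anns, key=lambda a: a["start"])
        paragraphs[video_id] = " ".join(a["sentence"].strip() for a in ordered)
    return paragraphs


# Toy records (made up for illustration, not taken from DiDeMo).
example = [
    {"video_id": "v1", "start": 10.0, "sentence": "A dog runs across the yard."},
    {"video_id": "v1", "start": 0.0, "sentence": "A child opens the gate."},
    {"video_id": "v2", "start": 0.0, "sentence": "Waves hit the shore."},
]
paragraphs = build_paragraphs(example)
```

Each resulting paragraph then serves as the single text query for its video, which is why the split lists one caption per video.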

🤗 Hugging Face

Papers using DiDeMo (2)

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval (2021)

T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval (2024)