MSR-VTT
Emerging · 1 paper using it
First seen: 2026
MSR-VTT is a benchmark dataset of video clips paired with descriptive text, used to evaluate models on multi-modal retrieval tasks.