Exploiting Semantic Role Contextualized Video Features for Multi-Instance Text-Video Retrieval: EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022
2022 · Burak Satar, Hongyuan Zhu, Hanwang Zhang, et al.
Abstract
In this report, we present our approach to the EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. We first parse each sentence into semantic roles corresponding to verbs and nouns. We then use self-attention to exploit semantic-role-contextualized video features together with textual features, training with triplet losses in multiple embedding spaces. Our method surpasses the strong baseline in normalized Discounted Cumulative Gain (nDCG), the metric that better reflects semantic similarity. Our submission ranked 3rd for nDCG and 4th for mAP.
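The two quantities the abstract relies on, a triplet loss summed over several embedding spaces and nDCG, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the space names (`verb`, `noun`, `global`), the margin of 0.2, cosine similarity as the scoring function, and the linear-gain form of DCG are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: the positive should score at least
    `margin` higher than the negative (margin value is an assumption)."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


def multi_space_loss(text, pos_video, neg_video, margin=0.2):
    """Sum the triplet loss over every embedding space.

    Each argument maps a (hypothetical) space name to an embedding.
    """
    return sum(
        triplet_loss(text[space], pos_video[space], neg_video[space], margin)
        for space in text
    )


def ndcg(relevances):
    """nDCG for one query, given graded relevances in ranked order.

    Uses linear gain with a log2 discount (an assumed, common form).
    """
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Toy embeddings: the positive video matches the caption in every space.
text = {"verb": [1.0, 0.0], "noun": [0.0, 1.0], "global": [1.0, 1.0]}
pos = {"verb": [1.0, 0.0], "noun": [0.0, 1.0], "global": [1.0, 1.0]}
neg = {"verb": [0.0, 1.0], "noun": [1.0, 0.0], "global": [1.0, -1.0]}

good_loss = multi_space_loss(text, pos, neg)  # correct ordering
bad_loss = multi_space_loss(text, neg, pos)   # positive and negative swapped
perfect_ranking = ndcg([1.0, 0.5, 0.0])
worse_ranking = ndcg([0.0, 0.5, 1.0])
```

Because nDCG works on graded relevance rather than a single correct match, a ranking that places semantically similar videos near the top still scores well, which is why it suits the multi-instance setting.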
Related papers
- Symmetric Multi-similarity Loss For EPIC-KITCHENS-100 Multi-instance Retrieval Challenge 2024 (2024)
- ContextRefine-CLIP For EPIC-KITCHENS-100 Multi-instance Retrieval Challenge 2025 (2025)
- Egocentric Video-language Pretraining @ EPIC-KITCHENS-100 Multi-instance Retrieval Challenge 2022 (2022)
- UniUD-FBK-UB-UniBZ Submission To The EPIC-KITCHENS-100 Multi-instance Retrieval Challenge 2022 (2022)
- On Semantic Similarity In Video Retrieval (2021)
- UniUD Submission To The EPIC-KITCHENS-100 Multi-instance Retrieval Challenge 2023 (2023)
- Multilevel Language And Vision Integration For Text-to-clip Retrieval (2018)
- Semantic Role Aware Correlation Transformer For Text To Video Retrieval (2022)