mFollowIR: A Multilingual Benchmark For Instruction Following In Retrieval
2025 · Orion Weller, Benjamin Chang, Eugene Yang, et al.
Abstract
Retrieval systems generally focus on web-style queries that are short and underspecified. However, advances in language models have facilitated the nascent rise of retrieval models that can understand more complex queries with diverse intents. Yet these efforts have focused exclusively on English; therefore, we do not yet understand how they work across languages. We introduce mFollowIR, a multilingual benchmark for measuring instruction-following ability in retrieval models. mFollowIR builds upon the TREC NeuCLIR narratives (or instructions) that span three diverse languages (Russian, Chinese, Persian), giving both query and instruction to the retrieval models. We make small changes to the narratives and isolate how well retrieval models can follow these nuanced changes. We present results for both multilingual (XX-XX) and cross-lingual (En-XX) performance. We see strong cross-lingual performance with English-based retrievers that were trained using instructions, but find a notable dro
Related papers
- NeuCLIRBench: A Modern Evaluation Collection For Monolingual, Cross-language, And Multilingual Information Retrieval (2025)
- Towards Better Instruction Following Retrieval Models (2025)
- UniIR: Training And Benchmarking Universal Multimodal Information Retrievers (2023)
- MAIR: A Massive Benchmark For Evaluating Instructed Retrieval (2024)
- Dual-view Training For Instruction-following Information Retrieval (2026)
- Can Instructed Retrieval Models Really Support Exploration? (2026)
- What Drives Cross-lingual Ranking? Retrieval Approaches With Multilingual Language Models (2025)
- Bridging Language Gaps: Advances In Cross-lingual Information Retrieval With Multilingual LLMs (2025)