A Comparative Study of Specialized LLMs as Dense Retrievers
2025 · Hengran Zhang, Keping Bi, Jiafeng Guo
Abstract
While large language models (LLMs) are increasingly deployed as dense retrievers, the impact of their domain-specific specialization on retrieval effectiveness remains underexplored. This investigation systematically examines how task-specific adaptations in LLMs influence their retrieval capabilities, an essential step toward developing unified retrievers capable of handling text, code, images, and multimodal content. We conduct extensive experiments with eight Qwen2.5 7B LLMs, including base, instruction-tuned, code/math-specialized, long reasoning, and vision-language models across zero-shot retrieval settings and the supervised setting. For the zero-shot retrieval settings, we consider text retrieval from the BEIR benchmark and code retrieval from the CoIR benchmark. Further, to evaluate supervised performance, all LLMs are fine-tuned on the MS MARCO dataset. We find that mathematical specialization and the long reasoning capability cause consistent degradation in three settings, i
Related papers
- Scaling Sparse and Dense Retrieval in Decoder-only LLMs (2025)
- SLQ: Bridging Modalities via Shared Latent Queries for Retrieval with Frozen MLLMs (2026)
- Making Large Language Models Efficient Dense Retrievers (2025)
- MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs (2024)
- LightRetriever: A LLM-based Text Retrieval Architecture with Extremely Faster Query Inference (2025)
- CSPLADE: Learned Sparse Retrieval with Causal Language Models (2025)
- ExpandR: Teaching Dense Retrievers Beyond Queries with LLM Guidance (2025)
- Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems (2024)