Training For Compositional Sensitivity Reduces Dense Retrieval Generalization
2026 · Radoslav Ralev, Aditeya Baral, Iliya Zhechev, et al.
Abstract
Dense retrieval compresses texts into single embeddings ranked by cosine similarity. While efficient for recall, this interface is brittle for identity-level matching: minimal compositional edits (negation, role swaps) flip meaning yet retain high similarity. Motivated by geometric results for unit-sphere cosine spaces (Kang et al., 2025), we test this retrieval-composition tension in text-only retrieval. Across four dual-encoder backbones, adding structure-targeted negatives consistently reduces zero-shot NanoBEIR retrieval (8-9% mean nDCG@10 drop on small backbones; up to 40% on medium ones), while only partially improving pooled-space separation. Treating pooled cosine as a recall interface, we then benchmark verifiers scoring token-token cosine maps. MaxSim (late interaction) excels at reranking but fails to reject structural near-misses, whereas a small Transformer over similarity maps reliably separates near-misses under end-to-end training.
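To make the scoring interfaces named in the abstract concrete, here is a minimal pure-Python sketch of pooled cosine, a token-token cosine map, and MaxSim. It is an illustration under stated assumptions, not the authors' code: the helper names (`cosine`, `mean_pool`, `similarity_map`, `maxsim`), the choice of mean pooling, and the toy one-hot "token embeddings" are all hypothetical. Real encoders produce contextual token vectors, so the effect is less extreme in practice than in this toy case.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pool(tokens):
    """Average token vectors into one embedding (an assumed pooling choice)."""
    n = len(tokens)
    return [sum(col) / n for col in zip(*tokens)]

def similarity_map(q_tokens, d_tokens):
    """Token-token cosine map: rows are query tokens, columns are document tokens."""
    return [[cosine(q, d) for d in d_tokens] for q in q_tokens]

def maxsim(sim_map):
    """Late-interaction MaxSim: best document match per query token, summed."""
    return sum(max(row) for row in sim_map)

# Hypothetical, non-contextual token vectors: the "document" contains the
# same tokens as the query with the first and last positions swapped,
# standing in for a role swap.
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
role_swap = [query[2], query[1], query[0]]

pooled_score = cosine(mean_pool(query), mean_pool(role_swap))
maxsim_score = maxsim(similarity_map(query, role_swap))

print("pooled cosine:", pooled_score)
print("MaxSim:", maxsim_score)
for row in similarity_map(query, role_swap):
    print(row)
```

In this toy setting both pooled cosine and MaxSim give the swapped text the same score as an exact copy, because both discard token order. The similarity map itself still differs (anti-diagonal rather than diagonal), which is the kind of signal a learned verifier over the map could in principle use.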
Related papers
- BERM: Training The Balanced And Extractable Representation For Matching To Improve Generalization Ability Of Dense Retrieval (2023)
- Dense Retrievers Can Fail On Simple Queries: Revealing The Granularity Dilemma Of Embeddings (2025)
- Maximal Matching Matters: Preventing Representation Collapse For Robust Cross-modal Retrieval (2025)
- Back To Basics: A Simple Recipe For Improving Out-of-domain Retrieval In Dense Encoders (2023)
- PyLate: Flexible Training And Retrieval For Late Interaction Models (2025)
- SCOT: Self-supervised Contrastive Pretraining For Zero-shot Compositional Retrieval (2025)
- From Mapping To Composing: A Two-stage Framework For Zero-shot Composed Image Retrieval (2025)
- Data-efficient Generalization For Zero-shot Composed Image Retrieval (2025)