Complementing Lexical Retrieval With Semantic Residual Embedding
2020 · Luyu Gao, Zhuyun Dai, Tongfei Chen, et al.
Abstract
This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model. Using a novel residual-based embedding learning method, CLEAR explicitly trains the neural embedding to encode the language structures and semantics that lexical retrieval fails to capture. Empirical evaluations demonstrate CLEAR's advantages over state-of-the-art retrieval models and show that it can substantially improve the end-to-end accuracy and efficiency of reranking pipelines.
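The abstract does not spell out the training objective, so the following is only a minimal sketch of one way a residual-based objective could look. It assumes a pairwise hinge loss whose margin shrinks when the lexical model (e.g. BM25) already ranks the relevant document above the non-relevant one, so the embedding is pushed hardest on pairs the lexical model gets wrong. The function names and the parameters `xi`, `lam`, and `weight` are hypothetical and are not taken from this page.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product used here as the embedding similarity (an assumption)."""
    return sum(a * b for a, b in zip(u, v))


def residual_margin(lex_pos: float, lex_neg: float,
                    xi: float = 1.0, lam: float = 0.1) -> float:
    """Margin that shrinks as the lexical score gap (pos - neg) grows.

    xi and lam are hypothetical hyperparameters chosen for illustration.
    """
    return xi - lam * (lex_pos - lex_neg)


def residual_hinge_loss(q: Sequence[float],
                        d_pos: Sequence[float],
                        d_neg: Sequence[float],
                        lex_pos: float, lex_neg: float,
                        xi: float = 1.0, lam: float = 0.1) -> float:
    """Pairwise hinge loss on embedding scores with a lexical-aware margin."""
    margin = residual_margin(lex_pos, lex_neg, xi, lam)
    return max(0.0, margin - dot(q, d_pos) + dot(q, d_neg))


def combined_score(lex_score: float,
                   q: Sequence[float], d: Sequence[float],
                   weight: float = 0.1) -> float:
    """Retrieval-time score: lexical score plus weighted embedding score."""
    return lex_score + weight * dot(q, d)


# Toy query and document embeddings (made-up values for illustration).
q = [1.0, 0.0]
d_pos = [0.5, 0.0]
d_neg = [0.2, 0.0]

# Lexical model already separates the pair: small margin, no loss.
loss_easy = residual_hinge_loss(q, d_pos, d_neg, lex_pos=10.0, lex_neg=2.0)

# Lexical model ranks the pair the wrong way: large margin, positive loss.
loss_hard = residual_hinge_loss(q, d_pos, d_neg, lex_pos=2.0, lex_neg=10.0)
```

Under these assumptions the embedding receives no gradient signal on pairs that the lexical model already handles and a strong one where it fails, which is one reading of "residual" in the abstract.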
Related papers
- LexSemBridge: Fine-grained Dense Representation Enhancement Through Token-aware Embedding Augmentation (2025)
- CLEAR: Cross-lingual Enhancement In Alignment Via Reverse-training (2026)
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024)
- A Dense Representation Framework For Lexical And Semantic Matching (2022)
- Large Reasoning Embedding Models: Towards Next-generation Dense Retrieval Paradigm (2025)
- RzenEmbed: Towards Comprehensive Multimodal Retrieval (2025)
- VectorSearch: Enhancing Document Retrieval With Semantic Embeddings And Optimized Search (2024)
- Evaluating Embedding APIs For Information Retrieval (2023)