EnrichIndex: Using LLMs To Enrich Retrieval Indices Offline
2025 · Peter Baile Chen, Tomer Wolfson, Michael Cafarella, et al.
Abstract
Existing information retrieval systems excel in cases where the language of target documents closely matches that of the user query. However, real-world retrieval systems are often required to implicitly reason whether a document is relevant. For example, when retrieving technical texts or tables, their relevance to the user query may be implied through a particular jargon or structure, rather than explicitly expressed in their content. Large language models (LLMs) hold great potential in identifying such implied relevance by leveraging their reasoning skills. Nevertheless, current LLM-augmented retrieval is hindered by high latency and computation cost, as the LLM typically computes the query-document relevance online, for every query anew. To tackle this issue we introduce EnrichIndex, a retrieval approach which instead uses the LLM offline to build semantically-enriched retrieval indices, by performing a single pass over all documents in the retrieval corpus once during ingestion ti
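The offline-enrichment idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `enrich_with_llm` is a hypothetical stand-in for a real LLM call (it uses a tiny hard-coded glossary), and the term-overlap scoring is a deliberate simplification rather than the retrieval method used by EnrichIndex. The point is only the cost structure: the expensive enrichment step runs once per document at ingestion, and query-time search touches no LLM.

```python
from collections import Counter


def enrich_with_llm(doc_text: str) -> str:
    """Hypothetical stand-in for an offline LLM call.

    A real system would prompt an LLM to describe the document's purpose or
    expand its jargon. Here a hard-coded glossary plays that role.
    """
    glossary = {"atorvastatin": "cholesterol medication statin"}
    extra = [glossary[w] for w in doc_text.lower().split() if w in glossary]
    return " ".join(extra)


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def build_index(corpus: dict[str, str]) -> dict[str, Counter]:
    """Single pass over the corpus: one enrichment call per document."""
    index = {}
    for doc_id, text in corpus.items():
        enriched = enrich_with_llm(text)  # runs offline, at ingestion time
        index[doc_id] = Counter(tokenize(text + " " + enriched))
    return index


def search(index: dict[str, Counter], query: str, k: int = 1) -> list[str]:
    """Query-time scoring by term overlap; no LLM call is made here."""
    q_terms = tokenize(query)
    scores = {doc_id: sum(tf[t] for t in q_terms) for doc_id, tf in index.items()}
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return ranked[:k]


corpus = {
    "d1": "Dosage table for atorvastatin 10mg and 20mg tablets",
    "d2": "Guide to hiking trails near the lake",
}
index = build_index(corpus)
top = search(index, "cholesterol medication")
```

In this toy corpus the query shares no words with the original text of `d1`; it matches only through the terms added during enrichment, which is the kind of implied relevance the abstract refers to.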
Related papers
- LightRetriever: A LLM-based Text Retrieval Architecture With Extremely Faster Query Inference (2025) 0.00
- LLM-augmented Retrieval: Enhancing Retrieval Models Through Language Models And Doc-level Embedding (2024) 0.00
- ScalingNote: Scaling Up Retrievers With Large Language Models For Real-world Dense Retrieval (2024) 0.00
- ExpandR: Teaching Dense Retrievers Beyond Queries With LLM Guidance (2025) 3.25
- Improving Tool Retrieval By Leveraging Large Language Models For Query Generation (2024) 0.00
- Bridging Language Gaps: Advances In Cross-lingual Information Retrieval With Multilingual LLMs (2025) 0.00
- Retrieval-enhanced Machine Learning (2022) 11.93
- An Interactive Multi-modal Query Answering System With Retrieval-augmented Large Language Models (2024) 5.84