Efficient Context Selection For Long-context QA: No Tuning, No Iteration, Just Adaptive-\(k\)
2025 · Chihiro Taguchi, Seiji Maekawa, Nikita Bhutani
Abstract
Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain question answering (QA). However, determining the optimal amount of external context to retrieve remains an open problem: fixing the retrieval size risks either wasting tokens or omitting key evidence. Existing adaptive methods like Self-RAG and Self-Route rely on iterative LLM prompting and perform well on factoid QA, but struggle with aggregation QA, where the optimal context size is both unknown and variable. We present Adaptive-\(k\) retrieval, a simple and effective single-pass method that adaptively selects the number of passages based on the distribution of the similarity scores between the query and the candidate passages. It does not require model fine-tuning, extra LLM inferences, or changes to existing retriever-reader pipelines. On both factoid and aggregation QA benchmarks, Adaptive-\(k\) matches or outperforms fixed-\(k\) baselines while using up to 10x fewer tokens.
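The abstract says only that the number of passages is chosen from the distribution of query-passage similarity scores; it does not spell out the selection rule. The sketch below illustrates one plausible rule, assumed here for illustration rather than taken from the paper: sort the scores in descending order and cut at the largest drop between consecutive scores. The function names `adaptive_k` and `select_passages` and the parameters `min_k` and `max_k` are hypothetical.

```python
from typing import List, Optional, Sequence


def adaptive_k(
    scores: Sequence[float],
    min_k: int = 1,
    max_k: Optional[int] = None,
) -> int:
    """Return a cutoff k chosen at the largest gap in the sorted scores.

    ASSUMPTION: the largest-gap rule is an illustrative heuristic, not a
    rule confirmed by the abstract. min_k and max_k are hypothetical bounds.
    """
    n = len(scores)
    if n == 0:
        return 0
    ranked = sorted(scores, reverse=True)
    upper = n if max_k is None else min(max_k, n)
    lower = max(1, min(min_k, upper))

    best_k, best_gap = lower, -1.0
    # Cutting after the k-th passage separates ranked[k-1] from ranked[k].
    for k in range(lower, upper):
        gap = ranked[k - 1] - ranked[k]
        if gap > best_gap:
            best_gap, best_k = gap, k
    return best_k


def select_passages(
    passages: Sequence[str],
    scores: Sequence[float],
    min_k: int = 1,
    max_k: Optional[int] = None,
) -> List[str]:
    """Return the top-k passages by score, with k chosen by adaptive_k."""
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    k = adaptive_k(scores, min_k, max_k)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in order[:k]]


if __name__ == "__main__":
    demo_scores = [0.91, 0.88, 0.35, 0.30, 0.86]
    demo_passages = ["a", "b", "c", "d", "e"]
    print(adaptive_k(demo_scores))
    print(select_passages(demo_passages, demo_scores))
```

With the demo scores, the three high-scoring passages are separated from the rest by a drop of about 0.51, so the sketch keeps passages "a", "b", and "e". The selection runs in a single pass over scores the retriever already produced, which matches the abstract's claim that no extra LLM inference is needed; any real system would still need to tune or validate the cutoff rule on its own data.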
Related papers
- Re-ranking The Context For Multimodal Retrieval Augmented Generation (2025)
- Optimizing Retrieval-augmented Generation: Analysis Of Hyperparameter Impact On Performance And Efficiency (2025)
- You Only Use Reactive Attention Slice For Long Context Retrieval (2024)
- Optimizing Retrieval For RAG Via Reinforcement Learning (2025)
- SV-RAG: LoRA-contextualizing Adaptation Of MLLMs For Long Document Understanding (2024)
- A Systematic Study Of Retrieval Pipeline Design For Retrieval-augmented Medical Question Answering (2026)
- RAG Without Forgetting: Continual Query-infused Key Memory (2026)
- Ragsmith: A Framework For Finding The Optimal Composition Of Retrieval-augmented Generation Methods Across Datasets (2025)