Erarag: Efficient And Incremental Retrieval Augmented Generation For Growing Corpora
2025 Β· Fangyuan Zhang, Zhengjun Huang, Yingli Zhou, et al.
Abstract
Graph-based Retrieval-Augmented Generation (Graph-RAG) enhances large language models (LLMs) by structuring retrieval over an external corpus. However, existing approaches typically assume a static corpus, requiring expensive full-graph reconstruction whenever new documents arrive, limiting their scalability in dynamic, evolving environments. To address these limitations, we introduce EraRAG, a novel multi-layered Graph-RAG framework that supports efficient and scalable dynamic updates. Our method leverages hyperplane-based Locality-Sensitive Hashing (LSH) to partition and organize the original corpus into hierarchical graph structures, enabling efficient and localized insertions of new data without disrupting the existing topology. The design eliminates the need for retraining or costly recomputation while preserving high retrieval accuracy and low latency. Experiments on large-scale benchmarks demonstrate that EraRag achieves up to an order of magnitude reduction in update time and t
Authors
(none)
Tags
Stats
Related papers
- Hetarag: Hybrid Deep Retrieval-augmented Generation Across Heterogeneous Data Stores (2025)3.27
- Advancing Retrieval-augmented Generation For Structured Enterprise And Internal Data (2025)1.20
- MG\(^2\)-RAG: Multi-granularity Graph For Multimodal Retrieval-augmented Generation (2026)0.00
- Slimrag: Retrieval Without Graphs Via Entity-aware Context Selection (2025)1.91
- Universalrag: Retrieval-augmented Generation Over Corpora Of Diverse Modalities And Granularities (2025)0.00
- Funnelrag: A Coarse-to-fine Progressive Retrieval Paradigm For RAG (2024)3.58
- Domain-aware RAG: Mol-enhanced RL For Efficient Training And Scalable Retrieval (2025)0.00
- Regionrag: Region-level Retrieval-augmented Generation For Visual Document Understanding (2025)0.00