SAGE: Structure Aware Graph Expansion For Retrieval Of Heterogeneous Data
2026 · Prasham Titiya, Rohit Khoja, Tomer Wolfson, et al.
Abstract
Retrieval-augmented question answering over heterogeneous corpora requires connected evidence across text, tables, and graph nodes. While entity-level knowledge graphs support structured access, they are costly to construct and maintain, and inefficient to traverse at query time. In contrast, standard retriever-reader pipelines use flat similarity search over independently chunked text, missing multi-hop evidence chains across modalities. We propose the SAGE (Structure Aware Graph Expansion) framework, which (i) constructs a chunk-level graph offline using metadata-driven similarities with percentile-based pruning, and (ii) performs online retrieval by running an initial baseline retriever to obtain k seed chunks, expanding first-hop neighbors, and then filtering the neighbors using dense+sparse retrieval, selecting k' additional chunks. We instantiate the initial retriever using hybrid dense+sparse retrieval for implicit cross-modal corpora and SPARK (Structure Aware Planning Agent for Retr
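The two stages named in the abstract (offline percentile-pruned graph construction, then online seed expansion and filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nearest-rank percentile rule, and the precomputed `relevance` dictionary standing in for the dense+sparse filtering step are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set, Tuple


def percentile_cutoff(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of `values` (an assumed pruning rule)."""
    if not values:
        raise ValueError("no similarity values to prune")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def build_chunk_graph(
    similarity: Dict[Tuple[str, str], float], pct: float
) -> Dict[str, Set[str]]:
    """Offline step: keep only edges at or above the percentile cutoff."""
    cutoff = percentile_cutoff(list(similarity.values()), pct)
    graph: Dict[str, Set[str]] = {}
    for (a, b), score in similarity.items():
        if score >= cutoff:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def expand_and_filter(
    seeds: List[str],
    graph: Dict[str, Set[str]],
    relevance: Dict[str, float],
    k_prime: int,
) -> List[str]:
    """Online step: gather first-hop neighbors of the k seed chunks, then
    keep the k' neighbors with the highest query relevance.

    `relevance` is a hypothetical stand-in for a dense+sparse score.
    """
    seed_set = set(seeds)
    candidates: Set[str] = set()
    for seed in seeds:
        candidates |= graph.get(seed, set()) - seed_set
    # Sort by descending relevance; break ties by chunk id for determinism.
    ranked = sorted(candidates, key=lambda c: (-relevance.get(c, 0.0), c))
    return list(seeds) + ranked[:k_prime]
```

In this sketch the seeds are always kept and only the expanded neighbors compete for the k' extra slots, which mirrors the "k seed chunks plus k' additional chunks" wording of the abstract.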
Related papers
- HetaRAG: Hybrid Deep Retrieval-augmented Generation Across Heterogeneous Data Stores (2025) · 3.27
- Multimodal RAG For Unstructured Data: Leveraging Modality-aware Knowledge Graphs With Hybrid Retrieval (2025) · 0.00
- SlimRAG: Retrieval Without Graphs Via Entity-aware Context Selection (2025) · 1.91
- EraRAG: Efficient And Incremental Retrieval Augmented Generation For Growing Corpora (2025) · 4.51
- Imagine All The Relevance: Scenario-profiled Indexing With Knowledge Expansion For Dense Retrieval (2025) · 0.00
- MG\(^2\)-RAG: Multi-granularity Graph For Multimodal Retrieval-augmented Generation (2026) · 0.00
- SAGE: Spatial-visual Adaptive Graph Exploration For Efficient Visual Place Recognition (2025) · 2.16
- A Reference Architecture For Agentic Hybrid Retrieval In Dataset Search (2026) · 0.00