Awesome Databases
Databases is one of the most active areas in Awesome AI Agents, with 53 papers in this collection evaluated on datasets such as SQuAD, MultiHop-RAG, and DataCrossBench. A strong starting point is "M3: Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis".
Datasets & benchmarks
Key papers
- M3: Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis (2025) · Rafi Al Attrach et al. · 5.72
- MetaConfigurator: AI-Assisted RDF Authoring from JSON Data (2026) · Felix Neubauer et al. · 5.01
- Ensembles of Large Language Models for Identifying EQ-5D Studies in PubMed Based on Their Abstracts (2026) · Zhyar Rzgar K. Rostam et al. · 5.01
- A BART-based approach with hierarchical strategy for Vietnamese abstractive multi-document summarization (2026) · Vu Nguyen Nguyen Xuan et al. · 5.01
- VCG: A Multimodal Retrieval Framework for E-Commerce Video Feeds under Extreme Cold-Start Conditions (2026) · Katya Mirylenka et al. · 5.01
- FineREX: Fine-Tuned NER-RE for Human Smuggling Knowledge Graphs (2026) · Elijah Feldman et al. · 5.01
- Fast LLM-Based Semantic Filtering: From a Unified Framework to an Adaptive Two-Phase Method (2026) · Kyoungmin Kim et al. · 4.39
- Graph2Idea: Retrieval-Augmented Scientific Idea Generation with Graph-Structured Contexts (2026) · Xu Li et al. · 4.39
- Hyperdimensional computing for structured querying on tabular data embeddings (2026) · Sebastián Bugedo et al. · 4.39
- Semantics-Enhanced Retrieval-Augmented Time Series Forecasting (2026) · Shiqiao Zhou et al. · 4.39
- Few-Shot Biomedical Relation Extraction with Large Language Models: A Viable Alternative to Supervised Learning? (2026) · Jakob Mraz et al. · 4.39
- Ricci-Filtration: Boosting Retrieval-Augmented Generation Reranker to Query-Answer Tasks by Discrete Ricci Flow (2026) · Tian Qin et al. · 4.39
- AthDGC: An Open Diachronic Greek Treebank with Indo-European Parallels (2026) · Nikolaos Lavidas et al. · 4.39
- A Unified Framework for Context-Aware and Relation-Aware Graph Retrieval-Augmented Generation (2026) · Haoyang Zhong et al. · 4.39
- GLARE: A Natural Language Interface for Querying Global Explanations (2026) · Bhavan Vasu et al. · 4.39
- Policy-aware Vector Search: A Vision for Fine Grained Access Control in Vector Databases (2026) · Lakshmi Sahithi Yalamarthi et al. · 4.39
- Implicit Semantic-Aware Communication Based on Hypergraph Reasoning (2026) · Yiwei Liao et al. · 4.39
- CHRONOS: Temporally-Aware Multi-Agent Coordination for Evolving Data Marketplaces (2026) · Joydeep Chandra · 4.33
- LECTOR: Joint Optimization of Scientific Reasoning Graphs and Introduction Generation (2026) · Jiabei Xiao et al. · 4.33
- DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering (2025) · Rong Cheng et al. · 3.75
- DataEvolver: Automatic Data Preparation for Large Language Models through Multi-Level Self-Evolving (2026) · Chao Deng et al. · 3.51
- Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting (2026) · Amritansh Maurya et al. · 3.45
- Cost-Efficient RAG for Entity Matching with LLMs: A Blocking-based Exploration (2026) · Chuangtao Ma et al. · 3.27
- An Entity Linking Agent for Question Answering (2025) · Yajie Luo et al. · 3.04
- Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding (2025) · Sakhinana Sagar Srinivas et al. · 2.28
- LLM-TabLogic: Preserving Inter-Column Logical Relationships in Synthetic Tabular Data via Prompt-Guided Latent Diffusion (2026) · Yunbo Long et al. · 2.00
- Retrieval-Augmented Large Language Models for Schema-Constrained Clinical Information Extraction (2026) · A H M Rezaul Karim et al. · 2.00
- ChartDesign: Towards LLM Designer of Data Visualization (2026) · Mohammed Afaan Ansari et al. · 2.00
- HPC-LLM: Practical Domain Adaptation and Retrieval-Augmented Generation for HPC Support (2026) · Nourin Shahin et al. · 2.00
- LERA: LLM-Enhanced RAG for Ad Auction in Generative Chatbots (2026) · Haoran Sun et al. · 2.00
- LogRouter: Adaptive Two-Level LLM Routing for Log Question Answering in Big Data Systems (2026) · Mert Coskuner et al. · 2.00
- Hyrax: An Extensible Framework for Rapid ML Experimentation and Unsupervised Discovery in the Era of Rubin, Roman, and Euclid (2026) · Aritra Ghosh et al. · 2.00
- DeepJEB++: Foundation Model-Driven Large-Scale 3D Engineering Dataset via 2D Latent Space Augmentation (2026) · Soyoung Yoo et al. · 2.00
- Retrieval-augmented Reasoning For Chartered Accountancy (2026) · Jatin Gupta, Akhil Sharma, Saransh Singhania, et al. · 2.00
- Pexa: Parallel Exploration Agent For Complex Text-to-SQL (2026) · Tanmay Parekh, Ella Hofmann-Coyle, Shuyi Wang, et al. · 2.00
- Dingent: An Easily Deployable Database Retrieval and Integration Agent framework (2026) · Demian Kong et al. · 2.00
- Siriushelper: An LLM Agent-based Operations Assistant For Big Data Platforms (2026) · Yu Shen, Shiyang Liu, Qihang He, et al. · 2.00
- Webaggregator: Enhancing Compositional Reasoning Capabilities Of Deep Research Agent Foundation Models (2026) · Rui Wang, Ce Zhang, Jun-Yu Ma, et al. · 2.00
- Self-reinforcing Controllable Synthesis Of Rare Relational Data Via Bayesian Calibration (2026) · Chongsheng Zhang, Hao Wang, Zelong Yu, et al. · 2.00
- RAGnaroX: A Secure, Local-Hosted ChatOps Assistant Using Small Language Models (2026) · Benedikt Dornauer et al. · 1.89
- DataCross: A Unified Benchmark and Agent Framework for Cross-Modal Heterogeneous Data Analysis (2026) · Ruyi Qi et al. · 1.72
- Bid Farewell to Seesaw: Towards Accurate Long-tail Session-based Recommendation via Dual Constraints of Hybrid Intents (2025) · Xiao Wang et al. · 1.61
- AI for Distributed Systems Design: Scalable Cloud Optimization Through Repeated LLMs Sampling And Simulators (2025) · Jacopo Tagliabue · 1.56
- Relation-Aware Bayesian Optimization of DBMS Configurations Guided by Affinity Scores (2025) · Sein Kwon et al. · 1.56
- Language-Native Materials Processing Design by Lightly Structured Text Database and Reasoning Large Language Model (2025) · Yuze Liu et al. · 1.50
- Inference Scaled GraphRAG: Improving Multi Hop Question Answering on Knowledge Graphs (2025) · Travis Thompson et al. · 1.33
- Towards Effective Federated Graph Foundation Model via Mitigating Knowledge Entanglement (2025) · Yinlin Zhu et al. · 1.28
- Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization (2025) · Yutao Zhu et al. · 1.28
- KnowPath: Knowledge-enhanced Reasoning via LLM-generated Inference Paths over Knowledge Graphs (2025) · Qi Zhao et al. · 1.11
- Learning Index Selection with Structured Action Spaces (2019) · Jeremy Welborn et al.
- Good Intentions: Adaptive Parameter Management via Intent Signaling (2022) · Alexander Renz-Wieland et al.
- InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration (2024) · Fali Wang et al.
- IA2: Leveraging Instance-Aware Index Advisor with Reinforcement Learning for Diverse Workloads (2024) · Taiyi Wang et al.