A Dynamic Retrieval-Augmented Generation System with Selective Memory and Remembrance

Abstract

We introduce *Adaptive RAG Memory* (ARM), a retrieval-augmented generation (RAG) framework that replaces a static vector index with a *dynamic* memory substrate governed by selective remembrance and decay. Frequently retrieved items are consolidated and protected from forgetting, while rarely used items gradually decay, a design inspired by principles of cognitive consolidation and forgetting. On a lightweight retrieval benchmark, ARM reaches near state-of-the-art performance (e.g., NDCG@5 \(\approx\) 0.940, Recall@5 \(=1.000\)) with only \(\sim\)22M parameters in the embedding layer, achieving the best efficiency among ultra-efficient models (\(<\)25M parameters). In addition, we compare static and dynamic RAG configurations with Llama 3.1 and GPT-4o. Llama 3.1 with static RAG achieves the highest key-term coverage (67.2%) at moderate latency, while GPT-4o with a dynamic selective retrieval policy attains the fastest responses (8.2 s on average) with competitive coverage (58.7%). We further presen
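
The remembrance-and-decay behaviour described in the abstract can be illustrated with a minimal sketch. This is not ARM's implementation: the class and parameter names (`DecayingMemory`, `decay_rate`, `consolidate_after`, `forget_below`) are hypothetical, and exponential decay with a hit-count consolidation rule is an assumption made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryItem:
    text: str
    strength: float = 1.0       # current retention strength, in (0, 1]
    hits: int = 0               # number of times the item was retrieved
    consolidated: bool = False  # consolidated items are exempt from decay


class DecayingMemory:
    """Toy memory store: retrieved items are reinforced, unused items fade."""

    def __init__(self, decay_rate=0.1, consolidate_after=3, forget_below=0.2):
        # All three thresholds are illustrative assumptions, not values
        # taken from the paper.
        self.decay_rate = decay_rate
        self.consolidate_after = consolidate_after
        self.forget_below = forget_below
        self.items = {}

    def add(self, key, text):
        self.items[key] = MemoryItem(text)

    def touch(self, key):
        """Record a retrieval: reset strength and consolidate frequent items."""
        item = self.items[key]
        item.hits += 1
        item.strength = 1.0
        if item.hits >= self.consolidate_after:
            item.consolidated = True

    def step(self):
        """Advance one time step: decay unconsolidated items, drop weak ones."""
        for key in list(self.items):
            item = self.items[key]
            if item.consolidated:
                continue
            item.strength *= math.exp(-self.decay_rate)
            if item.strength < self.forget_below:
                del self.items[key]
```

In this sketch, calling `touch` on every retrieval and `step` on a fixed schedule keeps frequently used entries in the store indefinitely, while entries that are never retrieved fall below `forget_below` and are removed.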
