ImageRAG: Dynamic Image Retrieval For Reference-guided Image Generation
2025 · Rotem Shalev-Arkushin, Rinon Gal, Amit H. Bermano, et al.
Abstract
Diffusion models enable high-quality and diverse visual content synthesis. However, they struggle to generate rare or unseen concepts. To address this challenge, we explore the use of Retrieval-Augmented Generation (RAG) with image generation models. We propose ImageRAG, a method that dynamically retrieves relevant images based on a given text prompt and uses them as context to guide the generation process. Prior approaches that used retrieved images to improve generation trained models specifically for retrieval-based generation. In contrast, ImageRAG leverages the capabilities of existing image-conditioning models and does not require RAG-specific training. Our approach is highly adaptable and can be applied across different model types, showing significant improvement in generating rare and fine-grained concepts using different base models. Our project page is available at: https://rotem-shalev.github.io/ImageRAG
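The retrieve-then-condition idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `retrieve`, `build_conditioning`), the three-dimensional toy embeddings, and the file names are all hypothetical, and the abstract does not say which embedding model or similarity measure ImageRAG uses. Cosine similarity over a shared text-image embedding space is assumed here purely for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(prompt_vec, image_index, k=2):
    """Return the names of the k indexed images most similar to the prompt.

    image_index is a list of (name, embedding) pairs. In a real system the
    embeddings would come from a text-image model; here they are toy values.
    """
    ranked = sorted(image_index,
                    key=lambda item: cosine(prompt_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

def build_conditioning(prompt, references):
    """Bundle the prompt with retrieved references (hypothetical request shape).

    An existing image-conditioned generator would consume something like this;
    no RAG-specific training of the generator is assumed.
    """
    return {"prompt": prompt, "reference_images": list(references)}

# Toy index: hand-written embeddings, assumed for illustration only.
image_index = [
    ("axolotl.jpg", [0.9, 0.1, 0.0]),
    ("golden_retriever.jpg", [0.1, 0.9, 0.0]),
    ("salamander.jpg", [0.8, 0.2, 0.1]),
]
prompt_vec = [1.0, 0.0, 0.0]  # stand-in for the embedded text prompt

refs = retrieve(prompt_vec, image_index, k=2)
request = build_conditioning("an axolotl reading a book", refs)
print(request)
```

The point of the sketch is the separation of concerns: retrieval runs per prompt at generation time, and its output is passed to the generator only as extra context, which is why the base model can be swapped without retraining.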
Related papers
- Cross-modal RAG: Sub-dimensional Text-to-image Retrieval-augmented Generation (2025)
- AR-RAG: Autoregressive Retrieval Augmentation For Image Generation (2025)
- RealRAG: Retrieval-augmented Realistic Image Generation Via Self-reflective Contrastive Learning (2025)
- Visual-RAG: Benchmarking Text-to-image Retrieval Augmented Generation For Visual Knowledge Intensive Queries (2025)
- Text-guided Synthesis Of Artistic Images With Retrieval-augmented Diffusion Models (2022)
- RAVID: Retrieval-augmented Visual Detection: A Knowledge-driven Approach For AI-generated Image Identification (2025)
- RegionRAG: Region-level Retrieval-augmented Generation For Visual Document Understanding (2025)
- IRGen: Generative Modeling For Image Retrieval (2023)