Image Retrieval Outperforms Diffusion Models On Data Augmentation
2023 · Max F. Burg, Florian Wenzel, Dominik Zietlow, et al.
Abstract
Many approaches have been proposed to use diffusion models to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large datasets, often with noisy annotations, and it remains an open question to what extent these models contribute to downstream classification performance. In particular, it remains unclear whether they generalize enough to improve over directly using the additional data of their pre-training process for augmentation. We systematically evaluate a range of existing methods to generate images from diffusion models and study new extensions to assess their benefit for data augmentation. Personalizing diffusion models towards the target data outperforms simpler prompting strategies. However, using the pre-training data of the diffusion model alone, via a simple nearest-neighbor retrieval procedure, leads to even stronger downstream performance. Our study explores the potential of diffusion models in generatin
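The nearest-neighbor retrieval idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that target images and a pool of candidate images have already been embedded as vectors by some encoder; the abstract does not say which encoder, similarity measure, or value of k the authors use, so cosine similarity, the function name `retrieve_nearest`, and the toy vectors below are all hypothetical choices for illustration only.

```python
import numpy as np

def retrieve_nearest(query_emb, pool_emb, k=2):
    """Return, for each query row, the indices of the k most cosine-similar pool rows.

    Assumption: embeddings are precomputed; the encoder is not specified here.
    """
    # Normalize rows so that a dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pool_emb / np.linalg.norm(pool_emb, axis=1, keepdims=True)
    sims = q @ p.T  # shape: (num_queries, num_pool)
    # Sort by descending similarity and keep the first k indices per query.
    return np.argsort(-sims, axis=1)[:, :k]

# Toy data (hypothetical): 4 pool embeddings and 2 target-image embeddings.
pool = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
queries = np.array([[1.0, 0.05], [0.1, 1.0]])

neighbors = retrieve_nearest(queries, pool, k=2)
# Deduplicated pool indices that would be added to the training set.
augmentation_ids = sorted(set(neighbors.ravel().tolist()))
print(neighbors.tolist(), augmentation_ids)
```

In this sketch the retrieved pool items would simply be appended to the target training set; at realistic pool sizes an approximate nearest-neighbor index would likely replace the dense similarity matrix.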
Related papers
- Daug: Diffusion-based Channel Augmentation For Radiology Image Retrieval And Classification (2024)
- Where's Waldo: Diffusion Features For Personalized Segmentation And Retrieval (2024)
- Text-guided Synthesis Of Artistic Images With Retrieval-augmented Diffusion Models (2022)
- A Review Of Image Retrieval Techniques: Data Augmentation And Adversarial Learning Approaches (2024)
- Diffusion Art Or Digital Forgery? Investigating Data Replication In Diffusion Models (2022)
- Genetic Algorithms For The Optimization Of Diffusion Parameters In Content-based Image Retrieval (2019)
- MV-RAG: Retrieval Augmented Multiview Diffusion (2025)
- Text-to-image Diffusion Models Are Great Sketch-photo Matchmakers (2024)