Diffusion Art Or Digital Forgery? Investigating Data Replication In Diffusion Models
2022 · Gowthami Somepalli, Vasu Singla, Micah Goldblum, et al.
Abstract
Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they replicating content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated. Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication. We also identify cases where diffusion models, including the popular Stable Diffusion model, blatantly copy from their training data.
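The retrieval-based comparison described in the abstract can be illustrated with a minimal sketch. It assumes each image has already been mapped to a feature vector by some embedding model (the abstract does not name one), and it uses cosine similarity with a hypothetical threshold of 0.5 to flag likely replicas. The function names `top_match` and `flag_replicas` are illustrative and are not taken from the paper.

```python
import numpy as np


def top_match(gen_feats, train_feats):
    """Return, for each generated embedding, the index of and cosine
    similarity to its nearest training embedding."""
    gen = np.asarray(gen_feats, dtype=float)
    train = np.asarray(train_feats, dtype=float)
    # L2-normalise rows so the dot product equals cosine similarity.
    gen = gen / np.linalg.norm(gen, axis=1, keepdims=True)
    train = train / np.linalg.norm(train, axis=1, keepdims=True)
    sims = gen @ train.T  # shape: (n_generated, n_training)
    idx = sims.argmax(axis=1)
    return idx, sims[np.arange(len(gen)), idx]


def flag_replicas(gen_feats, train_feats, threshold=0.5):
    """List (generated_index, training_index, similarity) for every
    generated image whose best match meets the threshold.
    The default threshold is an assumed value for illustration only."""
    idx, scores = top_match(gen_feats, train_feats)
    return [
        (i, int(j), float(s))
        for i, (j, s) in enumerate(zip(idx, scores))
        if s >= threshold
    ]
```

In practice the threshold would need to be calibrated for the chosen embedding model, and a brute-force similarity matrix like this one is only feasible for small training sets; a dataset on the scale of LAION would call for an approximate nearest-neighbour index instead.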
Related papers
- Diffusion Models Generate Images Like Painters: An Analytical Theory Of Outline First, Details Later (2023)
- Image Retrieval Outperforms Diffusion Models On Data Augmentation (2023)
- Text-guided Synthesis Of Artistic Images With Retrieval-augmented Diffusion Models (2022)
- Where's Waldo: Diffusion Features For Personalized Segmentation And Retrieval (2024)
- Genetic Algorithms For The Optimization Of Diffusion Parameters In Content-based Image Retrieval (2019)
- Dare To Plagiarize? Plagiarized Painting Recognition And Retrieval (2025)
- Text-to-image Diffusion Models Are Great Sketch-photo Matchmakers (2024)
- Adafuse: Adaptive Diffusion-generated Image And Text Fusion For Interactive Text-to-image Retrieval (2026)