SPARK-IL: Spectral Retrieval-augmented RAG For Knowledge-driven Deepfake Detection Via Incremental Learning

Abstract

Detecting AI-generated images remains difficult because detectors trained on specific generators often fail to generalize to unseen models. While pixel-level artifacts vary across models, frequency-domain signatures are more consistent, which makes them a promising foundation for cross-generator detection. We propose SPARK-IL, a retrieval-augmented framework that combines dual-path spectral analysis with incremental learning. One path uses a partially frozen ViT-L/14 encoder for semantic representations; a parallel path embeds raw RGB pixels. Both paths undergo multi-band Fourier decomposition into four frequency bands. Each band is processed by Kolmogorov-Arnold Networks (KAN) with mixture-of-experts for band-specific transformations, and the resulting spectral embeddings are fused via cross-attention with residual connections. During inference, this fused embedding retrieves the \(k\) nearest labeled signatures from a
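
The multi-band Fourier decomposition mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the four bands are defined, so everything here is an assumption made for illustration: the function name `split_into_bands` is hypothetical, the transform is a 1-D FFT over the embedding dimension, the bands are contiguous and of equal width, and the 768-dimensional random vector merely stands in for an encoder output.

```python
import numpy as np

def split_into_bands(embedding, num_bands=4):
    """Split a 1-D embedding into band-limited components (illustrative only)."""
    embedding = np.asarray(embedding, dtype=np.float64)
    n = embedding.shape[-1]
    # Real-input FFT: n // 2 + 1 complex frequency bins.
    spectrum = np.fft.rfft(embedding)
    # Assumption: equal-width, contiguous bands over the frequency bins.
    edges = np.linspace(0, spectrum.shape[-1], num_bands + 1).astype(int)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Keep only the bins of this band, zero the rest, and invert.
        masked = np.zeros_like(spectrum)
        masked[lo:hi] = spectrum[lo:hi]
        bands.append(np.fft.irfft(masked, n=n))
    return np.stack(bands)

# Hypothetical stand-in for an encoder embedding.
rng = np.random.default_rng(0)
x = rng.standard_normal(768)
bands = split_into_bands(x)  # shape (4, 768): one row per frequency band
```

Because the bands partition the frequency bins and the inverse transform is linear, the four components sum back to the original vector. In this sketch, each row could then be passed to its own band-specific transformation.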
