ImageNet-256 Leaderboard

#	Model	FID (lower is better)	Paper
1	Boosting Latent Diffusion Models via Disentangled Representation Alignment	1.21	—
2	One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation	1.29	—
3	Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings	1.56	—
4	Guiding Token-Sparse Diffusion Models	1.58	—
5	There is No VAE: End-to-End Pixel-Space Generative Modeling via Self-Supervised Pre-training	1.58	—
6	PixelDiT: Pixel Diffusion Transformers for Image Generation	1.61	—
7	CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching	1.68	—
8	Rethinking Cross-Layer Information Routing in Diffusion Transformers	2.11	—
9	Scalable GANs with Transformers	2.96	—
10	Terminal Velocity Matching	3.29	—
11	Mean Flows for One-step Generative Modeling	3.43	—
12	PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss	5.11	—
13	Guiding a Diffusion Transformer with the Internal Dynamics of Itself	5.31	—
