ImageNet-256 Leaderboard
Auto-discovered from papers reporting ImageNet-256 (FID) · Metric: FID (lower is better)
| # | Model (paper title) | FID |
|---|---|---|
| 1 | Boosting Latent Diffusion Models via Disentangled Representation Alignment | 1.21 |
| 2 | One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation | 1.29 |
| 3 | Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings | 1.56 |
| 4 | Guiding Token-Sparse Diffusion Models | 1.58 |
| 5 | There is No VAE: End-to-End Pixel-Space Generative Modeling via Self-Supervised Pre-training | 1.58 |
| 6 | PixelDiT: Pixel Diffusion Transformers for Image Generation | 1.61 |
| 7 | CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching | 1.68 |
| 8 | Rethinking Cross-Layer Information Routing in Diffusion Transformers | 2.11 |
| 9 | Scalable GANs with Transformers | 2.96 |
| 10 | Terminal Velocity Matching | 3.29 |
| 11 | Mean Flows for One-step Generative Modeling | 3.43 |
| 12 | PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss | 5.11 |
| 13 | Guiding a Diffusion Transformer with the Internal Dynamics of Itself | 5.31 |