| # | Model | FID | Paper |
|---|-------|-----|-------|
| 1 | Boosting Latent Diffusion Models via Disentangled Representation Alignment | 1.21 | - |
| 2 | One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation | 1.29 | - |
| 3 | Soft-Di[M]O: Improving One-Step Discrete Image Generation with Soft Embeddings | 1.56 | - |
| 4 | Guiding Token-Sparse Diffusion Models | 1.58 | - |
| 5 | There is No VAE: End-to-End Pixel-Space Generative Modeling via Self-Supervised Pre-training | 1.58 | - |
| 6 | PixelDiT: Pixel Diffusion Transformers for Image Generation | 1.61 | - |
| 7 | CAR-Flow: Condition-Aware Reparameterization Aligns Source and Target for Better Flow Matching | 1.68 | - |
| 8 | Rethinking Cross-Layer Information Routing in Diffusion Transformers | 2.11 | - |
| 9 | Scalable GANs with Transformers | 2.96 | - |
| 10 | Terminal Velocity Matching | 3.29 | - |
| 11 | Mean Flows for One-step Generative Modeling | 3.43 | - |
| 12 | PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss | 5.11 | - |
| 13 | Guiding a Diffusion Transformer with the Internal Dynamics of Itself | 5.31 | - |
ImageNet-256 Leaderboard