| # | Model | FID | Paper |
|---|-------|-----|-------|
| 1 | Unified Latents (UL): How to train your latents | 1.40 | - |
| 2 | PixelDiT: Pixel Diffusion Transformers for Image Generation | 1.81 | - |
| 3 | There is No VAE: End-to-End Pixel-Space Generative Modeling via Self-Supervised Pre-training | 2.35 | - |
| 4 | Terminal Velocity Matching | 4.32 | - |
ImageNet-512x512 Leaderboard