SimplerEnv
Emerging
13 papers using it
First seen: 2025
SimplerEnv is a simulation-based benchmark used to evaluate Vision-Language-Action (VLA) models in a controlled environment.
Papers using SimplerEnv (13)
- TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers
- VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
- StarVLA: A Lego-like Codebase for Vision-Language-Action Model Developing
- ReFineVLA: Multimodal Reasoning-Aware Generalist Robotic Policies via Teacher-Guided Fine-Tuning
- Efficient Long-Horizon Vision-Language-Action Models via Static-Dynamic Disentanglement
- Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process
- MAPS: Preserving Vision-Language Representations via Module-Wise Proximity Scheduling for Better Vision-Language-Action Generalization
- Spatial Traces: Enhancing VLA Models with Spatial-Temporal Understanding
- InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
- STORM: Search-guided Generative World Models For Robotic Manipulation
- TTF-VLA: Temporal Token Fusion Via Pixel-attention Integration For Vision-language-action Models
- Self-improving Vision-language-action Models With Data Generation Via Residual RL
- Maniagent: An Agentic Framework For General Robotic Manipulation