SIMPLER Benchmark
Emerging
7 papers using it
First seen: 2025
The SIMPLER Benchmark is used to evaluate Vision-Language-Action (VLA) models on robotic manipulation tasks. It assesses their ability to generate the desired action tokens while minimizing the influence of distracting image tokens.
Papers using SIMPLER Benchmark (7)
- DTP: A Simple Yet Effective Distracting Token Pruning Framework For Vision-language Action Models
- DAM-VLA: A Dynamic Action Model-Based Vision-Language-Action Framework for Robot Manipulation
- Beyond Attention Magnitude: Leveraging Inter-layer Rank Consistency for Efficient Vision-Language-Action Models
- MVP-LAM: Learning Action-Centric Latent Action via Cross-Viewpoint Reconstruction
- DepthVLA: Enhancing Vision-Language-Action Models with Depth-Aware Spatial Reasoning
- Villa-x: Enhancing Latent Action Modeling In Vision-language-action Models
- VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching