LunarLander-v-2

Name: LunarLander-v-2
License: mit

Emerging

5 papers using it

38 HF downloads

1 HF like

2024 first seen

LunarLander-v2 - Imitation Learning Datasets

This is a dataset created by the Imitation Learning Datasets project. It was created using Stable Baselines weights from a PPO policy from HuggingFace.

Description

The dataset consists of 1,000 episodes with an average episodic reward of 500. Each entry consists of: obs (list


Papers using LunarLander-v-2 (5)

Not All Transitions Matter: Evidence from PPO · 2026

Directed-MAML: Meta Reinforcement Learning Algorithm with Task-directed Approximation · 2025 · 1 citation

Representation over Routing: Diagnosing Temporal Routing Pathologies in Multi-Timescale PPO · 2026

PrefPoE: Advantage-Guided Preference Fusion for Learning Where to Explore · 2025

Optimal Policy Sparsification and Low Rank Decomposition for Deep Reinforcement Learning · 2024 · 1 citation