Behavioral Entropy-guided Dataset Generation For Offline Reinforcement Learning
2025 · Wesley A. Suttle, Aamodh Suresh, Carlos Nieto-Granda
Abstract
Entropy-based objectives are widely used to perform state space exploration in reinforcement learning (RL) and dataset generation for offline RL. Behavioral entropy (BE), a rigorous generalization of classical entropies that incorporates cognitive and perceptual biases of agents, was recently proposed for discrete settings and shown to be a promising metric for robotic exploration problems. In this work, we propose using BE as a principled exploration objective for systematically generating datasets that provide diverse state space coverage in complex, continuous, potentially high-dimensional domains. To achieve this, we extend the notion of BE to continuous settings, derive tractable \(k\)-nearest neighbor estimators, provide theoretical guarantees for these estimators, and develop practical reward functions that can be used with standard RL methods to learn BE-maximizing policies. Using standard MuJoCo environments, we experimentally compare the performance of offline RL algorithms f
Related papers
- A Dataset Perspective On Offline Reinforcement Learning (2021)
- Learning-driven Exploration For Reinforcement Learning (2019)
- Surprise-adaptive Intrinsic Motivation For Unsupervised Reinforcement Learning (2024)
- Behavior Estimation From Multi-source Data For Offline Reinforcement Learning (2022)
- How Do Offline Measures For Exploration In Reinforcement Learning Behave? (2020)
- Interpretable Performance Analysis Towards Offline Reinforcement Learning: A Dataset Perspective (2021)
- Don't Change The Algorithm, Change The Data: Exploratory Data For Offline Reinforcement Learning (2022)
- Maximum Entropy Exploration Without The Rollouts (2026)