How Do Offline Measures For Exploration In Reinforcement Learning Behave?
2020 · Jakob J. Hollenstein, Sayantan Auddy, Matteo Saveriano, et al.
Abstract
Sufficient exploration is paramount for the success of a reinforcement learning agent. Yet, exploration is rarely assessed in an algorithm-independent way. We compare the behavior of three data-based, offline exploration metrics described in the literature on simple, intuitive distributions and highlight problems to be aware of when using them. We propose a fourth metric, uniform relative entropy, and implement it using either a k-nearest-neighbor or a nearest-neighbor-ratio estimator, highlighting that the implementation choices have a profound impact on these measures.
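To make the k-nearest-neighbor variant concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that "uniform relative entropy" can be read as the KL divergence from the sample distribution to a uniform distribution over a known bounding box, which equals the log-volume of the box minus the differential entropy of the samples; the direction of the divergence and the estimator details used in the paper may differ. The entropy is estimated with the standard Kozachenko-Leonenko k-nearest-neighbor estimator. The function names (`knn_entropy`, `relative_entropy_to_uniform`) and the parameters `low`, `high`, and `k` are hypothetical and chosen only for illustration.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer n: psi(n) = -gamma + sum_{i<n} 1/i."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def knn_entropy(samples, k=3):
    """Kozachenko-Leonenko k-NN estimate of differential entropy (in nats).

    samples: array of shape (n, d). Uses brute-force pairwise distances,
    so it is only suitable for small n.
    """
    x = np.asarray(samples, dtype=float)
    n, d = x.shape
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    r_k = np.sort(dist, axis=1)[:, k]
    r_k = np.maximum(r_k, 1e-12)  # guard against duplicate points
    # Log-volume of the d-dimensional Euclidean unit ball.
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    return (digamma_int(n) - digamma_int(k) + log_unit_ball
            + d * float(np.mean(np.log(r_k))))


def relative_entropy_to_uniform(samples, low, high, k=3):
    """Assumed reading: KL(P || U) = log Vol(box) - H(P), with H(P) from k-NN."""
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    log_volume = float(np.sum(np.log(high - low)))
    return log_volume - knn_entropy(samples, k=k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spread = rng.uniform(0.0, 1.0, size=(1000, 2))     # covers the box
    clustered = rng.normal(0.5, 0.02, size=(1000, 2))  # visits one small region
    print(relative_entropy_to_uniform(spread, [0, 0], [1, 1]))
    print(relative_entropy_to_uniform(clustered, [0, 0], [1, 1]))
```

Under this reading, data that covers the box evenly yields a value near zero, while data concentrated in a small region yields a large positive value. The result depends on `k`, on the chosen bounding box, and on boundary effects, which is in line with the abstract's point that implementation choices strongly affect such measures.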
Related papers
- A Dataset Perspective On Offline Reinforcement Learning (2021)
- Behavioral Entropy-guided Dataset Generation For Offline Reinforcement Learning (2025)
- Maximum Entropy Exploration Without The Rollouts (2026)
- Don't Change The Algorithm, Change The Data: Exploratory Data For Offline Reinforcement Learning (2022)
- Offline Meta Learning Of Exploration (2020)
- Exploration By Random Distribution Distillation (2025)
- Revisiting Design Choices In Offline Model-based Reinforcement Learning (2021)
- Measuring Data Quality For Dataset Selection In Offline Reinforcement Learning (2021)