Interpretable Performance Analysis Towards Offline Reinforcement Learning: A Dataset Perspective
2021 · Chenyang Xi, Bo Tang, Jiajun Shen, et al.
Abstract
Offline reinforcement learning (RL) has increasingly become a focus of artificial intelligence research because of its wide real-world applications, where collecting data may be difficult, time-consuming, or costly. In this paper, we first propose a two-fold taxonomy of existing offline RL algorithms from the perspective of their exploration and exploitation tendencies. Second, we derive an explicit expression for the upper bound of the extrapolation error and explore the correlation between the performance of different types of algorithms and the distribution of actions under states. Specifically, we relax the strict assumption of a sufficiently large number of state-action tuples. Accordingly, we provably explain why batch-constrained Q-learning (BCQ) performs better than other existing techniques. Third, after identifying the weakness of BCQ on datasets with low mean episode returns, we propose a modified variant based on a top-return selection mechanism, which is proved to be able to
Related papers
- A Dataset Perspective On Offline Reinforcement Learning (2021)
- An Optimistic Perspective On Offline Reinforcement Learning (2019)
- Optimality Inductive Biases And Agnostic Guidelines For Offline Reinforcement Learning (2021)
- Expert Or Not? Assessing Data Quality In Offline Reinforcement Learning (2025)
- Fewer May Be Better: Enhancing Offline Reinforcement Learning With Reduced Dataset (2025)
- Bridging The Gap Between Offline And Online Reinforcement Learning Evaluation Methodologies (2022)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Don't Change The Algorithm, Change The Data: Exploratory Data For Offline Reinforcement Learning (2022)