Bridging The Gap Between Offline And Online Reinforcement Learning Evaluation Methodologies
2022 · Shivakanth Sujit, Pedro H. M. Braga, Jorg Bornschein, et al.
Abstract
Reinforcement learning (RL) has shown great promise with algorithms learning in environments with large state and action spaces purely from scalar reward signals. A crucial challenge for current deep RL algorithms is that they require a tremendous amount of environment interactions for learning. This can be infeasible in situations where such interactions are expensive, such as in robotics. Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning. While online RL algorithms are typically evaluated as a function of the number of environment interactions, there exists no single established protocol for evaluating offline RL methods. In this paper, we propose a sequential approach to evaluate offline RL algorithms as a function of the training set size and thus by their data efficiency. Sequential evaluation provides valuable insights into the data efficiency of t
Related papers
- Using Offline Data To Speed Up Reinforcement Learning In Procedurally Generated Environments (2023)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Towards Data-driven Offline Simulations For Online Reinforcement Learning (2022)
- Data Valuation For Offline Reinforcement Learning (2022)
- The Generalization Gap In Offline Reinforcement Learning (2023)
- Leveraging Offline Data In Online Reinforcement Learning (2022)
- Representation Matters: Offline Pretraining For Sequential Decision Making (2021)
- Bridging Offline Reinforcement Learning And Imitation Learning: A Tale Of Pessimism (2021)