The Generalization Gap In Offline Reinforcement Learning
2023 · Ishita Mediratta, Qingfei You, Minqi Jiang, et al.
Abstract
Despite recent progress in offline learning, these methods are still trained and tested on the same environment. In this paper, we compare the generalization abilities of widely used online and offline learning methods such as online reinforcement learning (RL), offline RL, sequence modeling, and behavioral cloning. Our experiments show that offline learning algorithms perform worse on new environments than online learning ones. We also introduce the first benchmark for evaluating generalization in offline learning, collecting datasets of varying sizes and skill-levels from Procgen (2D video games) and WebShop (e-commerce websites). The datasets contain trajectories for a limited number of game levels or natural language instructions and at test time, the agent has to generalize to new levels or instructions. Our experiments reveal that existing offline learning algorithms struggle to match the performance of online RL on both train and test environments. Behavioral cloning is a strong
Related papers
- Improving Zero-shot Generalization In Offline Reinforcement Learning Using Generalized Similarity Functions (2021) [2.26]
- Bridging The Gap Between Offline And Online Reinforcement Learning Evaluation Methodologies (2022) [0.00]
- Ensemble Successor Representations For Task Generalization In Offline-to-online Reinforcement Learning (2024) [2.26]
- Using Offline Data To Speed Up Reinforcement Learning In Procedurally Generated Environments (2023) [6.77]
- Assessing Generalization In Deep Reinforcement Learning (2018) [0.00]
- Domain Generalization For Robust Model-based Offline Reinforcement Learning (2022) [0.00]
- When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? (2022) [0.00]
- Offline Vs. Online Learning In Model-based RL: Lessons For Data Collection Strategies (2025) [0.00]