Data-efficient Pipeline For Offline Reinforcement Learning With Limited Data
2022 · Allen Nie, Yannis Flet-Berliac, Deon R. Jordan, et al.
Abstract
Offline reinforcement learning (RL) can be used to improve future performance by leveraging historical data. There exist many different algorithms for offline RL, and it is well recognized that these algorithms, and their hyperparameter settings, can lead to decision policies with substantially differing performance. This prompts the need for pipelines that allow practitioners to systematically perform algorithm-hyperparameter selection for their setting. Critically, in most real-world settings, this pipeline must only involve the use of historical data. Inspired by statistical model selection methods for supervised learning, we introduce a task- and method-agnostic pipeline for automatically training, comparing, selecting, and deploying the best policy when the provided dataset is limited in size. In particular, our work highlights the importance of performing multiple data splits to produce more reliable algorithm-hyperparameter selection. While this is a common approach in supervise
Related papers
- Fewer May Be Better: Enhancing Offline Reinforcement Learning With Reduced Dataset (2025)
- AWAC: Accelerating Online Reinforcement Learning With Offline Datasets (2020)
- Towards Data-driven Offline Simulations For Online Reinforcement Learning (2022)
- Data Valuation For Offline Reinforcement Learning (2022)
- Beyond Uniform Sampling: Offline Reinforcement Learning With Imbalanced Datasets (2023)
- Don't Change The Algorithm, Change The Data: Exploratory Data For Offline Reinforcement Learning (2022)
- Using Offline Data To Speed Up Reinforcement Learning In Procedurally Generated Environments (2023)
- Leveraging Offline Data In Online Reinforcement Learning (2022)