Cleanba: A Reproducible And Efficient Distributed Reinforcement Learning Platform
2023 · Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, et al.
Abstract
Distributed Deep Reinforcement Learning (DRL) aims to leverage more computational resources to train autonomous agents in less time. Despite recent progress in the field, reproducibility issues have not been sufficiently explored. This paper first shows that the typical actor-learner framework can have reproducibility issues even if hyperparameters are controlled. We then introduce Cleanba, a new open-source platform for distributed DRL that proposes a highly reproducible architecture. Cleanba implements highly optimized distributed variants of PPO and IMPALA. Our Atari experiments show that these variants can obtain equivalent or higher scores than strong IMPALA baselines in moolib and torchbeast and the PPO baseline in CleanRL. In addition, the Cleanba variants offer 1) shorter training time and 2) more reproducible learning curves across different hardware settings. Cleanba's source code is available at https://github.com/vwxyzjn/cleanba
Related papers
- CleanRL: High-quality Single-file Implementations Of Deep Reinforcement Learning Algorithms (2021)
- Distributed Deep Reinforcement Learning: An Overview (2020)
- RLlib: Abstractions For Distributed Reinforcement Learning (2017)
- Is Deep Reinforcement Learning Really Superhuman On Atari? Leveling The Playing Field (2019)
- Data Efficient Training For Reinforcement Learning With Adaptive Behavior Policy Sharing (2020)
- Robust Domain Randomised Reinforcement Learning Through Peer-to-peer Distillation (2020)
- SLM Lab: A Comprehensive Benchmark And Modular Software Framework For Reproducible Deep Reinforcement Learning (2019)
- Reliable Validation Of Reinforcement Learning Benchmarks (2022)