Let's Play Again: Variability Of Deep Reinforcement Learning Agents In Atari Environments
2019 · Kaleigh Clary, Emma Tosch, John Foley, et al.
Abstract
Reproducibility in reinforcement learning is challenging: uncontrolled stochasticity from many sources, such as the learning algorithm, the learned policy, and the environment itself, has led researchers to report the performance of learned agents using aggregate metrics over multiple random seeds for a single environment. Unfortunately, there are still pernicious sources of variability in reinforcement learning agents that make common summary statistics an unsound measure of performance. Our experiments demonstrate the variability of common agents used in the popular OpenAI Baselines repository. We make the case for reporting post-training agent performance as a distribution, rather than a point estimate.
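To make the point-estimate versus distribution contrast concrete, here is a minimal sketch in Python. It is not the authors' evaluation code: the function name `summarize_scores` and both score lists are hypothetical, invented purely for illustration. The two lists share the same mean but have very different spreads, which a distributional summary exposes and a single mean hides.

```python
import statistics


def summarize_scores(scores):
    """Return a point estimate plus distributional summaries for episode scores."""
    ordered = sorted(scores)
    # Quartile cut points using the statistics module's default method.
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.stdev(ordered),
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


# Hypothetical post-training episode scores (made-up numbers, not from the paper).
bimodal = [10, 12, 11, 9, 10, 90, 88, 91, 89, 90]
unimodal = [50, 49, 51, 50, 50, 50, 49, 51, 50, 50]

for name, scores in (("bimodal", bimodal), ("unimodal", unimodal)):
    print(name, summarize_scores(scores))
```

Both hypothetical agents would be reported identically if only the mean were given; the quartiles, minimum, and maximum show that one of them alternates between very low and very high scores.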
Related papers
- Beyond Expected Return: Accounting For Policy Reproducibility When Evaluating Reinforcement Learning Algorithms (2023)
- Reproducibility Of Benchmarked Deep Reinforcement Learning Tasks For Continuous Control (2017)
- Deep Reinforcement Learning At The Edge Of The Statistical Precipice (2021)
- Is Deep Reinforcement Learning Really Superhuman On Atari? Leveling The Playing Field (2019)
- Probing Transfer In Deep Reinforcement Learning Without Task Engineering (2022)
- Hackatari: Atari Learning Environments For Robust And Continual Reinforcement Learning (2024)
- Quantifying The Effects Of Environment And Population Diversity In Multi-agent Reinforcement Learning (2021)
- Is High Variance Unavoidable In RL? A Case Study In Continuous Control (2021)