Benchmarking Partial Observability In Reinforcement Learning With A Suite Of Memory-improvable Domains
2025 · Ruo Yu Tao, Kaicheng Guo, Cameron Allen, et al.
Abstract
Mitigating partial observability is a necessary but challenging task for general reinforcement learning algorithms. To improve an algorithm's ability to mitigate partial observability, researchers need comprehensive benchmarks to gauge progress. Most algorithms tackling partial observability are only evaluated on benchmarks with simple forms of state aliasing, such as feature masking and Gaussian noise. Such benchmarks do not represent the many forms of partial observability seen in real domains, like visual occlusion or unknown opponent intent. We argue that a partially observable benchmark should have two key properties. The first is coverage in its forms of partial observability, to ensure an algorithm's generalizability. The second is a large gap between the performance of agents with more or less state information, all other factors roughly equal. This gap implies that an environment is memory improvable: where performance gains in a domain are from an algorithm's ability to cop