Reinforcement Learning With Non-ergodic Reward Increments: Robustness Via Ergodicity Transformations
2023 · Dominik Baumann, Erfaun Noorani, James Price, et al.
Abstract
Envisioned application areas for reinforcement learning (RL) include autonomous driving, precision agriculture, and finance, which all require RL agents to make decisions in the real world. A significant challenge hindering the adoption of RL methods in these domains is the non-robustness of conventional algorithms. In particular, the focus of RL is typically on the expected value of the return. The expected value is the average over the statistical ensemble of infinitely many trajectories, which can be uninformative about the performance of the average individual. For instance, when we have a heavy-tailed return distribution, the ensemble average can be dominated by rare extreme events. Consequently, optimizing the expected value can lead to policies that yield exceptionally high returns with a probability that approaches zero but almost surely result in catastrophic outcomes in single long trajectories. In this paper, we develop an algorithm that lets RL agents optimize the long-term
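The gap the abstract describes between the ensemble average and what a single long trajectory experiences can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: the multiplicative coin-toss game, the multipliers `UP` and `DOWN`, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics

# Hypothetical multiplicative game (an assumption, not taken from the paper):
# each step multiplies wealth by UP on heads and by DOWN on tails.
UP, DOWN = 1.5, 0.6


def ensemble_growth_factor():
    """Expected per-step multiplier, i.e. the average over the ensemble."""
    return 0.5 * UP + 0.5 * DOWN


def time_average_growth_factor():
    """Per-step multiplier along one infinitely long trajectory
    (the geometric mean of the two multipliers)."""
    return math.exp(0.5 * math.log(UP) + 0.5 * math.log(DOWN))


def simulate(steps, rng):
    """Final wealth of a single trajectory that starts at 1.0."""
    log_wealth = 0.0  # accumulate in log space to avoid underflow
    for _ in range(steps):
        log_wealth += math.log(UP if rng.random() < 0.5 else DOWN)
    return math.exp(log_wealth)


rng = random.Random(0)  # fixed seed for reproducibility
finals = [simulate(1000, rng) for _ in range(2000)]
median_final = statistics.median(finals)
fraction_losing = sum(w < 1.0 for w in finals) / len(finals)

print("ensemble growth factor:    ", ensemble_growth_factor())
print("time-average growth factor:", time_average_growth_factor())
print("median final wealth:       ", median_final)
print("fraction ending below 1.0: ", fraction_losing)
```

In this toy game the expected multiplier per step is 1.05, so the expected wealth grows without bound, while the time-average multiplier is sqrt(0.9) ≈ 0.949, so almost every individual trajectory decays toward zero. The expected value is carried by a vanishing fraction of extremely lucky trajectories, which is the failure mode the abstract attributes to optimizing the expected return.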
Related papers
- Model-agnostic Solutions For Deep Reinforcement Learning In Non-ergodic Contexts (2026)
- Value Enhancement Of Reinforcement Learning Via Efficient And Robust Trust Region Optimization (2023)
- Enhancing Robustness In Deep Reinforcement Learning: A Lyapunov Exponent Approach (2024)
- Demystifying Reinforcement Learning In Time-varying Systems (2022)
- Maximum Entropy RL (provably) Solves Some Robust RL Problems (2021)
- Moments Matter: Stabilizing Policy Optimization Using Return Distributions (2026)
- Disturbing Reinforcement Learning Agents With Corrupted Rewards (2021)
- Sample-efficient Robust Multi-agent Reinforcement Learning In The Face Of Environmental Uncertainty (2024)