Robust Model-based Reinforcement Learning With An Adversarial Auxiliary Model
2024 · Siemen Herremans, Ali Anwar, Siegfried Mercelis
Abstract
Reinforcement learning has demonstrated impressive performance in various challenging problems such as robotics, board games, and classical arcade games. However, its real-world applications can be hindered by the absence of robustness and safety in the learned policies. More specifically, an RL agent that trains in a certain Markov decision process (MDP) often struggles to perform well in nearly identical MDPs. To address this issue, we employ the framework of Robust MDPs (RMDPs) in a model-based setting and introduce a novel learned transition model. Our method specifically incorporates an auxiliary pessimistic model, updated adversarially, to estimate the worst-case MDP within a Kullback-Leibler uncertainty set. In comparison to several existing works, our work does not impose any additional conditions on the training environment, such as the need for a parametric simulator. To test the effectiveness of the proposed pessimistic model in enhancing policy robustness, we integrate it i
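To make the idea of a worst-case model inside a Kullback-Leibler uncertainty set concrete, here is a minimal tabular sketch in Python. It is not the authors' method, which uses a learned auxiliary model updated adversarially. It only illustrates the standard result that, for a finite set of next states, the distribution minimizing expected value within a KL ball around a nominal distribution is an exponential tilt of that nominal distribution. The function names, the direction of the divergence (KL(q || p)), the radius `epsilon`, and the bisection bounds are all assumptions chosen for illustration.

```python
import math


def tilted(p, values, lam):
    """Exponentially tilted distribution q_i proportional to p_i * exp(-v_i / lam)."""
    v_min = min(values)  # shift by the minimum for numerical stability
    w = [pi * math.exp(-(vi - v_min) / lam) for pi, vi in zip(p, values)]
    z = sum(w)
    return [wi / z for wi in w]


def kl(q, p):
    """KL(q || p) for discrete distributions; terms with q_i == 0 contribute 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0.0)


def worst_case_distribution(p, values, epsilon, iters=200):
    """Approximate argmin_q E_q[values] subject to KL(q || p) <= epsilon.

    Bisection (on a log scale) over the temperature lam: a small lam gives a
    more pessimistic q with larger KL, a large lam keeps q close to p.
    The bounds 1e-6 and 1e6 are illustrative assumptions.
    """
    lo, hi = 1e-6, 1e6
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if kl(tilted(p, values, mid), p) > epsilon:
            lo = mid  # outside the KL ball: be less pessimistic
        else:
            hi = mid  # inside the KL ball: try to be more pessimistic
    return tilted(p, values, hi)  # hi always satisfies the KL constraint


# Hypothetical nominal next-state distribution and next-state values.
p = [0.5, 0.3, 0.2]
values = [1.0, 0.0, 2.0]
q = worst_case_distribution(p, values, epsilon=0.1)
```

In this toy setting, `q` shifts probability mass toward the low-value next state while staying within the chosen KL radius of `p`. A learned pessimistic model, as described in the abstract, would play a comparable role for continuous dynamics, where such a closed form over an enumerated state set is not available.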
Related papers
- Combining Pessimism With Optimism For Robust And Efficient Model-based Deep Reinforcement Learning (2021) · 0.00
- Safe Reinforcement Learning With Dual Robustness (2023) · 8.60
- Online Robust Policy Learning In The Presence Of Unknown Adversaries (2018) · 0.00
- Robust Lagrangian And Adversarial Policy Gradient For Robust Constrained Markov Decision Processes (2023) · 2.26
- A Bayesian Approach To Robust Reinforcement Learning (2019) · 0.00
- Robust Deep Reinforcement Learning Against Adversarial Perturbations On State Observations (2020) · 0.00
- Robust Reinforcement Learning On State Observations With Learned Optimal Adversary (2021) · 0.00
- Robust Adversarial Policy Optimization Under Dynamics Uncertainty (2026) · 0.00