Tight Bayesian Ambiguity Sets for Robust MDPs
2018 · Reazul Hasan Russel, Marek Petrik
Abstract
Robustness is important for sequential decision making in a stochastic dynamic environment with uncertain probabilistic parameters. We address the problem of using robust MDPs (RMDPs) to compute policies with provable worst-case guarantees in reinforcement learning. The quality and robustness of an RMDP solution are determined by its ambiguity set. Existing methods construct ambiguity sets that lead to impractically conservative solutions. In this paper, we propose RSVF, which achieves less conservative solutions with the same worst-case guarantees by 1) leveraging a Bayesian prior, 2) optimizing the size and location of the ambiguity set, and, most importantly, 3) relaxing the requirement that the set be a confidence interval. Our theoretical analysis shows the safety of RSVF, and the empirical results demonstrate its practical promise.
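To make the role of the ambiguity set concrete, here is a minimal sketch of the kind of baseline the abstract contrasts against. It is not RSVF. It assumes an L1-ball ambiguity set centered at the posterior mean of sampled transition probabilities, with the radius chosen as a quantile of the sampled L1 distances (a confidence-region style construction). The function names `bayes_l1_set` and `worst_case_l1`, the Dirichlet posterior, and all numbers are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def bayes_l1_set(samples, delta):
    """Build an L1-ball ambiguity set from posterior samples (assumed construction).

    samples: array of shape (n_samples, n_states), each row a sampled
             transition distribution for one state-action pair.
    Returns the posterior mean and the radius psi that covers a
    (1 - delta) fraction of the samples in L1 distance.
    """
    p_bar = samples.mean(axis=0)
    dists = np.abs(samples - p_bar).sum(axis=1)
    psi = float(np.quantile(dists, 1.0 - delta))
    return p_bar, psi


def worst_case_l1(v, p_bar, psi):
    """Solve min_q q @ v over the simplex subject to ||q - p_bar||_1 <= psi.

    Greedy solution: move up to psi / 2 probability mass onto the
    lowest-value state, taking it from the highest-value states first.
    """
    v = np.asarray(v, dtype=float)
    q = np.array(p_bar, dtype=float)
    order = np.argsort(v)          # states sorted by ascending value
    lo = order[0]
    eps = min(psi / 2.0, 1.0 - q[lo])
    q[lo] += eps
    for i in order[::-1]:          # highest-value states first
        if eps <= 0.0:
            break
        if i == lo:
            continue
        take = min(eps, q[i])
        q[i] -= take
        eps -= take
    return float(q @ v), q


# Illustrative usage with a hypothetical Dirichlet posterior over 3 next states.
rng = np.random.default_rng(0)
posterior_samples = rng.dirichlet([2.0, 3.0, 5.0], size=1000)
values = np.array([1.0, 2.0, 3.0])  # hypothetical next-state values

p_bar, psi = bayes_l1_set(posterior_samples, delta=0.1)
robust_value, q_worst = worst_case_l1(values, p_bar, psi)
print("nominal value:", float(p_bar @ values))
print("robust value: ", robust_value)
```

A larger radius `psi` lowers the robust value, which is the conservativeness the abstract refers to: under these assumptions, a set sized to cover most posterior samples can be much larger than what a worst-case guarantee on the return actually requires.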
Related papers
- Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk (2024)
- Sample Complexity of Robust Reinforcement Learning with a Generative Model (2021)
- A Bayesian Approach to Robust Reinforcement Learning (2019)
- Lyapunov Robust Constrained-MDPs: Soft-Constrained Robustly Stable Policy Optimization under Model Uncertainty (2021)
- Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization (2023)
- Robust Anytime Learning of Markov Decision Processes (2022)
- Solving Robust MDPs through No-Regret Dynamics (2023)
- Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees (2025)