Distributional Robustness And Regularization In Reinforcement Learning
2020 · Esther Derman, Shie Mannor
Abstract
Distributionally Robust Optimization (DRO) has made it possible to prove the equivalence between robustness and regularization in classification and regression, thus providing an analytical explanation of why regularization generalizes well in statistical learning. Although DRO's extension to sequential decision-making overcomes external uncertainty through the robust Markov Decision Process (MDP) setting, the resulting formulation is hard to solve, especially on large domains. On the other hand, existing regularization methods in reinforcement learning only address internal uncertainty due to stochasticity. Our study aims to facilitate robust reinforcement learning by establishing a dual relation between robust MDPs and regularization. We introduce Wasserstein distributionally robust MDPs and prove that they hold out-of-sample performance guarantees. Then, we introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributi
Related papers
- The Curious Price Of Distributional Robustness In Reinforcement Learning With A Generative Model (2023)
- On The Foundation Of Distributionally Robust Reinforcement Learning (2023)
- Twice Regularized MDPs And The Equivalence Between Robustness And Regularization (2021)
- Twice Regularized Markov Decision Processes: The Equivalence Between Robustness And Regularization (2023)
- Improving Robustness Via Risk Averse Distributional Reinforcement Learning (2020)
- Wasserstein Distributionally Robust Regret Optimization For Reinforcement Learning From Human Feedback (2026)
- Doubly Robust Distributionally Robust Off-policy Evaluation And Learning (2022)
- Distributional Reinforcement Learning With Regularized Wasserstein Loss (2022)