One Risk To Rule Them All: A Risk-sensitive Perspective On Model-based Offline Reinforcement Learning
2022 · Marc Rigter, Bruno Lacerda, Nick Hawes
Abstract
Offline reinforcement learning (RL) is suitable for safety-critical domains where online exploration is too costly or dangerous. In such settings, decision-making should account for the risk of catastrophic outcomes; in other words, it should be risk-sensitive. Previous works on risk in offline RL combine offline RL techniques, to avoid distributional shift, with risk-sensitive RL algorithms, to achieve risk-sensitivity. In this work, we propose risk-sensitivity as a mechanism to jointly address both of these issues. Our model-based approach is risk-averse to both epistemic and aleatoric uncertainty. Risk-aversion to epistemic uncertainty prevents distributional shift, as areas not covered by the dataset have high epistemic uncertainty. Risk-aversion to aleatoric uncertainty discourages actions that may result in poor outcomes due to environment stochasticity. Our experiments show that our algorithm achieves competitive performance on d
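The idea of one risk measure covering both kinds of uncertainty can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not name the risk measure or the model class, so the use of conditional value at risk (CVaR), the ensemble of Gaussian return models, and the names `cvar` and `risk_averse_value` are all assumptions made for illustration. Disagreement between ensemble members stands in for epistemic uncertainty, and the spread within each member stands in for aleatoric uncertainty.

```python
import numpy as np


def cvar(samples, alpha):
    """Mean of the worst alpha-fraction of the samples (assumed risk measure)."""
    ordered = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(np.ceil(alpha * ordered.size)))
    return ordered[:k].mean()


def risk_averse_value(member_means, member_stds, alpha=0.1, n=2000, seed=0):
    """CVaR of returns sampled jointly over ensemble members and noise.

    member_means / member_stds describe a hypothetical ensemble: each member
    predicts a Gaussian return for the same state-action pair.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(member_means, dtype=float)
    stds = np.asarray(member_stds, dtype=float)
    # Epistemic uncertainty: sample which ensemble member is the true model.
    idx = rng.integers(0, means.size, size=n)
    # Aleatoric uncertainty: sample noise within the chosen member.
    returns = rng.normal(means[idx], stds[idx])
    return cvar(returns, alpha)


# Three hypothetical actions with the same expected return of 1.0.
in_data = risk_averse_value([1.0, 1.0, 1.0, 1.0], [0.1] * 4)       # members agree, low noise
out_of_data = risk_averse_value([3.0, 1.5, -1.0, 0.5], [0.1] * 4)  # members disagree
noisy = risk_averse_value([1.0, 1.0, 1.0, 1.0], [1.0] * 4)         # members agree, high noise
```

Under these assumptions, the action whose ensemble members disagree and the action with high environment noise both receive a lower risk-averse value than the well-covered, low-noise action, even though all three share the same mean. This is the sense in which a single risk-averse objective can penalise both out-of-distribution actions and stochastic ones.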
Related papers
- DRL-ORA: Distributional Reinforcement Learning With Online Risk Adaption (2023)
- Expert-supervised Reinforcement Learning For Offline Policy Learning And Evaluation (2020)
- Distributionally Robust Model-based Offline Reinforcement Learning With Near-optimal Sample Complexity (2022)
- Revisiting Design Choices In Offline Model-based Reinforcement Learning (2021)
- Long-horizon Model-based Offline Reinforcement Learning Without Conservatism (2025)
- Online Bayesian Risk-averse Reinforcement Learning (2025)
- Constraints Penalized Q-learning For Safe Offline Reinforcement Learning (2021)
- Bridging Distributionally Robust Learning And Offline RL: An Approach To Mitigate Distribution Shift And Partial Data Coverage (2023)