Bayesian Robust Optimization For Imitation Learning
2020 · Daniel S. Brown, Scott Niekum, Marek Petrik
Abstract
One of the main challenges in imitation learning is determining what action an agent should take when outside the state distribution of the demonstrations. Inverse reinforcement learning (IRL) can enable generalization to new states by learning a parameterized reward function, but these approaches still face uncertainty over the true reward function and corresponding optimal policy. Existing safe imitation learning approaches based on IRL deal with this uncertainty using a maxmin framework that optimizes a policy under the assumption of an adversarial reward function, whereas risk-neutral IRL approaches optimize a policy for either the mean or MAP reward function. While completely ignoring risk can lead to overly aggressive and unsafe policies, optimizing in a fully adversarial sense is also problematic as it can lead to overly conservative policies that perform poorly in practice. To provide a bridge between these two extremes, we propose Bayesian Robust Optimization for Imitation Lea
Related papers
- Policy Gradient Bayesian Robust Optimization For Imitation Learning (2021)
- A Bayesian Solution To The Imitation Gap (2024)
- A Bayesian Approach To Robust Inverse Reinforcement Learning (2023)
- Active Learning For Risk-sensitive Inverse Reinforcement Learning (2019)
- Robust Model-free Reinforcement Learning With Multi-objective Bayesian Optimization (2019)
- Maximum-likelihood Inverse Reinforcement Learning With Finite-time Guarantees (2022)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- Blending Imitation And Reinforcement Learning For Robust Policy Improvement (2023)