Walking The Values In Bayesian Inverse Reinforcement Learning
2024 · Ondrej Bajgar, Alessandro Abate, Konstantinos Gatsis, et al.
Abstract
The goal of Bayesian inverse reinforcement learning (IRL) is to recover a posterior distribution over reward functions from a set of demonstrations by an expert optimizing a reward unknown to the learner. The resulting posterior over rewards can then be used to synthesize an apprentice policy that performs well on the same or a similar task. A key challenge in Bayesian IRL is bridging the computational gap between the hypothesis space of possible rewards and the likelihood, which is often defined in terms of Q-values: vanilla Bayesian IRL needs to solve the costly forward planning problem (going from rewards to Q-values) at every step of the algorithm, which may need to be done thousands of times. We propose to solve this by a simple change: instead of sampling primarily in the space of rewards, we work primarily in the space of Q-values, since the computation required to go from Q-values to rewards is radically cheaper. Furthermore, this reversion of the
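The cost asymmetry described in the abstract can be illustrated with a minimal sketch. It assumes a small tabular MDP with a known transition tensor and treats the given Q-values as optimal, so that the Bellman optimality equation Q(s, a) = r(s, a) + γ Σ_s' P(s' | s, a) max_a' Q(s', a') can simply be rearranged for r. The function names, array shapes, and the random MDP are hypothetical and chosen for illustration; this is not the authors' implementation, and the paper's actual likelihood and sampler are not reproduced here.

```python
import numpy as np

def q_to_reward(Q, P, gamma):
    """Cheap direction: one Bellman backup rearranged for r.

    Q has shape (S, A); P has shape (S, A, S) with P[s, a, s2] = P(s2 | s, a).
    Assumes Q is the optimal Q-function of the reward being recovered.
    """
    V = Q.max(axis=1)               # V(s) = max_a Q(s, a)
    return Q - gamma * (P @ V)      # r(s, a) = Q(s, a) - gamma * E[V(s')]

def reward_to_q(r, P, gamma, tol=1e-10, max_iters=10_000):
    """Costly direction: forward planning by value iteration on Q."""
    Q = np.zeros_like(r)
    for _ in range(max_iters):
        Q_new = r + gamma * (P @ Q.max(axis=1))
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new
    return Q

# Hypothetical random MDP used only to exercise the two directions.
rng = np.random.default_rng(0)
S, A, gamma = 5, 3, 0.9
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)   # normalise rows into distributions
r_true = rng.normal(size=(S, A))

Q_star = reward_to_q(r_true, P, gamma)      # many sweeps until convergence
r_back = q_to_reward(Q_star, P, gamma)      # a single matrix-vector product
```

Under these assumptions, `reward_to_q` needs hundreds of sweeps to converge, whereas `q_to_reward` is a single pass over the transition tensor, which is the asymmetry the abstract appeals to.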
Related papers
- Kernel Density Bayesian Inverse Reinforcement Learning (2023)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- Inverse Reinforcement Learning Without Reinforcement Learning (2023)
- Is Inverse Reinforcement Learning Harder Than Standard Reinforcement Learning? A Theoretical Perspective (2023)
- Basis For Intentions: Efficient Inverse Reinforcement Learning Using Past Experience (2022)
- A Survey Of Inverse Reinforcement Learning: Challenges, Methods And Progress (2018)
- Active Exploration For Inverse Reinforcement Learning (2022)
- Maximum-likelihood Inverse Reinforcement Learning With Finite-time Guarantees (2022)