A Bayesian Solution To The Imitation Gap
2024 · Risto Vuorio, Mattie Fellows, Cong Lu, et al.
Abstract
In many real-world settings, an agent must learn to act in environments where no reward signal can be specified, but a set of expert demonstrations is available. Imitation learning (IL) is a popular framework for learning policies from such demonstrations. However, in some cases, differences in observability between the expert and the agent can give rise to an imitation gap such that the expert's policy is not optimal for the agent and a naive application of IL can fail catastrophically. In particular, if the expert observes the Markov state and the agent does not, then the expert will not demonstrate the information-gathering behavior that the agent needs but the expert does not. In this paper, we propose a Bayesian solution to the Imitation Gap (BIG). BIG first uses the expert demonstrations, together with a prior specifying the cost of exploratory behavior that is not demonstrated, to infer a posterior over rewards with Bayesian inverse reinforcement learning (IRL). BIG then uses the reward
Related papers
- Bayesian Robust Optimization For Imitation Learning (2020)
- Good Better Best: Self-motivated Imitation Learning For Noisy Demonstrations (2023)
- State-only Imitation With Transition Dynamics Mismatch (2020)
- Toward The Fundamental Limits Of Imitation Learning (2020)
- Fully General Online Imitation Learning (2021)
- Walking The Values In Bayesian Inverse Reinforcement Learning (2024)
- Matching Multiple Experts: On The Exploitability Of Multi-agent Imitation Learning (2026)
- Kernel Density Bayesian Inverse Reinforcement Learning (2023)