Misspecification In Inverse Reinforcement Learning
2022 · Joar Skalse, Alessandro Abate
Abstract
The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function \(R\) from a policy \(\pi\). To do this, we need a model of how \(\pi\) relates to \(R\). In the current literature, the most common models are optimality, Boltzmann rationality, and causal entropy maximisation. One of the primary motivations behind IRL is to infer human preferences from human behaviour. However, the true relationship between human preferences and human behaviour is much more complex than any of the models currently used in IRL. This means that they are misspecified, which raises the worry that they might lead to unsound inferences if applied to real-world data. In this paper, we provide a mathematical analysis of how robust different IRL models are to misspecification, and answer precisely how the demonstrator policy may differ from each of the standard models before that model leads to faulty inferences about the reward function \(R\). We also introduce a framework for reasoning about missp
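
As a rough illustration of what a behavioural model is, the sketch below computes two of the models named in the abstract, optimality and Boltzmann rationality, for a tiny tabular MDP. It is a minimal sketch under stated assumptions, not the authors' code: the two-state MDP, the function names, the discount `gamma`, and the inverse temperature `beta` are all hypothetical choices made for this example, and the Boltzmann-rational policy is assumed to take the common form \(\pi(a \mid s) \propto \exp(\beta Q^\star(s, a))\).

```python
import numpy as np

def optimal_q(P, R, gamma=0.9, iters=500):
    """Value iteration on a tabular MDP (illustrative helper, not from the paper).

    P[s, a, s2] holds transition probabilities, R[s, a] holds rewards.
    """
    Q = np.zeros_like(R, dtype=float)
    for _ in range(iters):
        V = Q.max(axis=1)        # greedy state values, shape (S,)
        Q = R + gamma * (P @ V)  # Bellman optimality backup, shape (S, A)
    return Q

def optimal_policy(Q):
    """Optimality model: put all probability on a greedy action."""
    pi = np.zeros_like(Q)
    pi[np.arange(Q.shape[0]), Q.argmax(axis=1)] = 1.0
    return pi

def boltzmann_policy(Q, beta=1.0):
    """Boltzmann-rational model: pi(a|s) proportional to exp(beta * Q[s, a])."""
    z = beta * (Q - Q.max(axis=1, keepdims=True))  # shift for numerical stability
    w = np.exp(z)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical two-state MDP: action 0 stays put, action 1 switches state.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[1, 0, 1] = 1.0  # stay
P[0, 1, 1] = P[1, 1, 0] = 1.0  # switch
R = np.array([[0.0, 0.0],      # no reward for acting in state 0
              [1.0, 1.0]])     # reward 1 for acting in state 1

Q = optimal_q(P, R)
pi_opt = optimal_policy(Q)
pi_boltz = boltzmann_policy(Q, beta=1.0)
```

Both policies are derived from the same reward `R`, yet they differ as behaviour: the optimal policy is deterministic, while the Boltzmann-rational one spreads probability over suboptimal actions and approaches the optimal policy only as `beta` grows. An IRL method that assumes one model while the data follows the other is misspecified in the sense the abstract describes.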
Related papers
- Partial Identifiability And Misspecification In Inverse Reinforcement Learning (2024)
- Quantifying The Sensitivity Of Inverse Reinforcement Learning To Misspecification (2024)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- Accounting For Human Learning When Inferring Human Preferences (2020)
- Model Selection For Inverse Reinforcement Learning Via Structural Risk Minimization (2023)
- Modeling And Interpreting Real-world Human Risk Decision Making With Inverse Reinforcement Learning (2019)
- A Survey Of Inverse Reinforcement Learning: Challenges, Methods And Progress (2018)
- Inverse Reinforcement Learning With Simultaneous Estimation Of Rewards And Dynamics (2016)