Quantifying The Sensitivity Of Inverse Reinforcement Learning To Misspecification
2024 · Joar Skalse, Alessandro Abate
Abstract
Inverse reinforcement learning (IRL) aims to infer an agent's preferences (represented as a reward function \(R\)) from their behaviour (represented as a policy \(\pi\)). To do this, we need a behavioural model of how \(\pi\) relates to \(R\). In the current literature, the most common behavioural models are optimality, Boltzmann-rationality, and causal entropy maximisation. However, the true relationship between a human's preferences and their behaviour is much more complex than any of these behavioural models. This means that the behavioural models are misspecified, which raises the concern that they may lead to systematic errors if applied to real data. In this paper, we analyse how sensitive the IRL problem is to misspecification of the behavioural model. Specifically, we provide necessary and sufficient conditions that completely characterise how the observed data may differ from the assumed behavioural model without incurring an error above a given threshold. In addition to this,
Related papers
- Misspecification In Inverse Reinforcement Learning (2022)
- Partial Identifiability And Misspecification In Inverse Reinforcement Learning (2024)
- On The Model-misspecification In Reinforcement Learning (2023)
- Model Selection For Inverse Reinforcement Learning Via Structural Risk Minimization (2023)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- Accounting For Human Learning When Inferring Human Preferences (2020)
- A Survey Of Inverse Reinforcement Learning: Challenges, Methods And Progress (2018)
- Inverse Reinforcement Learning With Simultaneous Estimation Of Rewards And Dynamics (2016)