Partial Identifiability And Misspecification In Inverse Reinforcement Learning
2024 · Joar Skalse, Alessandro Abate
Abstract
The aim of Inverse Reinforcement Learning (IRL) is to infer a reward function \(R\) from a policy \(\pi\). This problem is difficult, for several reasons. First of all, there are typically multiple reward functions which are compatible with a given policy; this means that the reward function is only *partially identifiable*, and that IRL contains a certain fundamental degree of ambiguity. Secondly, in order to infer \(R\) from \(\pi\), an IRL algorithm must have a *behavioural model* of how \(\pi\) relates to \(R\). However, the true relationship between human preferences and human behaviour is very complex, and practically impossible to fully capture with a simple model. This means that the behavioural model in practice will be *misspecified*, which raises the worry that it might lead to unsound inferences if applied to real-world data. In this paper, we provide a comprehensive mathematical analysis of partial identifiability and misspecification in IRL. Specifically, we fully charact
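The ambiguity mentioned in the abstract can be made concrete with a small sketch. It is an illustration only, not the paper's own analysis: the two-state MDP, the potential values in `PHI`, and all function names are hypothetical. It relies on the well-known fact that potential shaping, \(R'(s,a,s') = R(s,a,s') + \gamma\Phi(s') - \Phi(s)\), leaves the optimal policy unchanged, so two distinct reward functions can be compatible with the same policy.

```python
GAMMA = 0.9

# Hypothetical 2-state, 2-action MDP (an assumption for illustration, not from the paper).
# P[s][a][t] is the probability of moving from state s to state t under action a.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[1.0, 0.0], [0.0, 1.0]],  # state 1: action 0 moves to state 0, action 1 stays
]
# R[s][a][t] is the reward for the transition (s, a, t).
R = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
# Arbitrary potential function over states, used only to build a second reward.
PHI = [5.0, -3.0]


def shape(R, phi, gamma):
    """Return the potential-shaped reward R'(s,a,t) = R(s,a,t) + gamma*phi[t] - phi[s]."""
    n_states = len(R)
    return [
        [
            [R[s][a][t] + gamma * phi[t] - phi[s] for t in range(n_states)]
            for a in range(len(R[s]))
        ]
        for s in range(n_states)
    ]


def q_value(P, R, V, gamma, s, a):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(P[s][a][t] * (R[s][a][t] + gamma * V[t]) for t in range(len(P)))


def optimal_policy(P, R, gamma, iters=500):
    """Run value iteration, then return the greedy action for each state."""
    n_states, n_actions = len(P), len(P[0])
    V = [0.0] * n_states
    for _ in range(iters):
        V = [
            max(q_value(P, R, V, gamma, s, a) for a in range(n_actions))
            for s in range(n_states)
        ]
    return [
        max(range(n_actions), key=lambda a: q_value(P, R, V, gamma, s, a))
        for s in range(n_states)
    ]


R_shaped = shape(R, PHI, GAMMA)
pi_original = optimal_policy(P, R, GAMMA)
pi_shaped = optimal_policy(P, R_shaped, GAMMA)
print(pi_original, pi_shaped)
```

The two reward tables differ in every entry, yet both yield the same greedy policy, so an observer who sees only the optimal policy cannot tell them apart. Potential shaping is just one example of such a transformation.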
Related papers
- Misspecification In Inverse Reinforcement Learning (2022)
- Quantifying The Sensitivity Of Inverse Reinforcement Learning To Misspecification (2024)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- A Survey Of Inverse Reinforcement Learning: Challenges, Methods And Progress (2018)
- Inverse Reinforcement Learning With Simultaneous Estimation Of Rewards And Dynamics (2016)
- Maximum-likelihood Inverse Reinforcement Learning With Finite-time Guarantees (2022)
- Task-guided Inverse Reinforcement Learning Under Partial Information (2021)
- Inverse Reinforcement Learning Without Reinforcement Learning (2023)