Inverse Reinforcement Learning Using Just Classification And A Few Regressions
2025 · Lars van der Laan, Nathan Kallus, Aurélien Bibaut
Abstract
Inverse reinforcement learning (IRL) aims to explain observed behavior by uncovering an underlying reward. In the maximum-entropy or Gumbel-shocks-to-reward frameworks, this amounts to fitting a reward function and a soft value function that together satisfy the soft Bellman consistency condition and maximize the likelihood of observed actions. While this perspective has had enormous impact in imitation learning for robotics and understanding dynamic choices in economics, practical learning algorithms often involve delicate inner-loop optimization, repeated dynamic programming, or adversarial training, all of which complicate the use of modern, highly expressive function approximators like neural nets and boosting. We revisit softmax IRL and show that the population maximum-likelihood solution is characterized by a linear fixed-point equation involving the behavior policy. This observation reduces IRL to two off-the-shelf supervised learning problems: probabilistic classification to es
Related papers
- Inverse Reinforcement Learning With Explicit Policy Estimates (2021)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- Inverse Reinforcement Learning With Simultaneous Estimation Of Rewards And Dynamics (2016)
- Maximum-likelihood Inverse Reinforcement Learning With Finite-time Guarantees (2022)
- Inverse Reinforcement Learning Without Reinforcement Learning (2023)
- A Survey Of Inverse Reinforcement Learning: Challenges, Methods And Progress (2018)
- Statistical Analysis Of Inverse Entropy-regularized Reinforcement Learning (2025)
- Efficient Inference For Inverse Reinforcement Learning And Dynamic Discrete Choice Models (2025)