Inverse Reinforcement Learning With Explicit Policy Estimates
2021 · Navyata Sanghvi, Shinnosuke Usami, Mohit Sharma, et al.
Abstract
Various methods for solving the inverse reinforcement learning (IRL) problem have been developed independently in machine learning and economics. In particular, the method of Maximum Causal Entropy IRL is based on the perspective of entropy maximization, while related advances in the field of economics instead assume the existence of unobserved action shocks to explain expert behavior (Nested Fixed Point Algorithm, Conditional Choice Probability method, Nested Pseudo-Likelihood Algorithm). In this work, we make previously unknown connections between these related methods from both fields. We achieve this by showing that they all belong to a class of optimization problems, characterized by a common form of the objective, the associated policy and the objective gradient. We demonstrate key computational and algorithmic differences which arise between the methods due to an approximation of the optimal soft value function, and describe how this leads to more efficient algorithms. Using ins
Related papers
- Maximum-likelihood Inverse Reinforcement Learning With Finite-time Guarantees (2022)
- Inverse Reinforcement Learning With Simultaneous Estimation Of Rewards And Dynamics (2016)
- Inverse Reinforcement Learning Using Just Classification And A Few Regressions (2025)
- Inverse Reinforcement Learning Without Reinforcement Learning (2023)
- Towards Theoretical Understanding Of Inverse Reinforcement Learning (2023)
- Statistical Analysis Of Inverse Entropy-regularized Reinforcement Learning (2025)
- A Survey Of Inverse Reinforcement Learning: Challenges, Methods And Progress (2018)
- Efficient Inference For Inverse Reinforcement Learning And Dynamic Discrete Choice Models (2025)