Non-adversarial Imitation Learning And Its Connections To Adversarial Methods
2020 · Oleg Arenz, Gerhard Neumann
Abstract
Many modern methods for imitation learning and inverse reinforcement learning, such as GAIL or AIRL, are based on an adversarial formulation. These methods apply GANs to match the expert's distribution over states and actions with the implicit state-action distribution induced by the agent's policy. However, by framing imitation learning as a saddle point problem, adversarial methods can suffer from unstable optimization, and convergence can only be shown for small policy updates. We address these problems by proposing a framework for non-adversarial imitation learning. The resulting algorithms are similar to their adversarial counterparts and, thus, provide insights for adversarial imitation learning methods. Most notably, we show that AIRL is an instance of our non-adversarial formulation, which enables us to greatly simplify its derivations and obtain stronger convergence guarantees. We also show that our non-adversarial formulation can be used to derive novel algorithms by presenti
Related papers
- Rethinking Adversarial Inverse Reinforcement Learning: Policy Imitation, Transferable Reward Recovery And Algebraic Equilibrium Proof (2024)
- Generative Adversarial Imitation Learning (2016)
- Learning Robust Rewards With Adversarial Inverse Reinforcement Learning (2017)
- C-GAIL: Stabilizing Generative Adversarial Imitation Learning With Control Theory (2024)
- When Will Generative Adversarial Imitation Learning Algorithms Attain Global Convergence (2020)
- A Pragmatic Look At Deep Imitation Learning (2021)
- On Discovering Algorithms For Adversarial Imitation Learning (2025)
- Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning (2020)