On Discovering Algorithms For Adversarial Imitation Learning
2025 · Shashank Reddy Chirra, Jayden Teoh, Praveen Paruchuri, et al.
Abstract
Adversarial Imitation Learning (AIL) methods, while effective in settings with limited expert demonstrations, are often considered unstable. These approaches typically decompose into two components: Density Ratio (DR) estimation \(\frac{\rho_E}{\rho_\pi}\), where a discriminator estimates the relative occupancy of state-action pairs under the policy versus the expert; and Reward Assignment (RA), where this ratio is transformed into a reward signal used to train the policy. While significant research has focused on improving density estimation, the role of reward assignment in influencing training dynamics and final policy performance has been largely overlooked. RA functions in AIL are typically derived from divergence minimization objectives, relying heavily on human design and ingenuity. In this work, we take a different approach: we investigate the discovery of data-driven RA functions, i.e., functions based directly on the performance of the resulting imitation policy. To this end, we
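To make the DR/RA split described in the abstract concrete, here is a minimal sketch of a few well-known hand-designed RA functions. It assumes the discriminator outputs a logit equal to the estimated log density ratio \(\log\frac{\rho_E}{\rho_\pi}\), so that \(D = \mathrm{sigmoid}(\text{logit})\). The function names (`softplus`, `reward_from_log_ratio`) and the `kind` labels are hypothetical and chosen for illustration; this is not the discovered RA function from the paper.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def reward_from_log_ratio(log_ratio: float, kind: str = "gail") -> float:
    """Map an estimated log density ratio log(rho_E / rho_pi) to a reward.

    Assumes D = sigmoid(log_ratio), i.e. the discriminator logit is the
    log ratio (an assumption of this sketch, not stated in the source).
    """
    if kind == "gail":
        # -log(1 - D): always non-negative
        return softplus(log_ratio)
    if kind == "airl":
        # log D - log(1 - D): reduces to the log ratio itself
        return log_ratio
    if kind == "log_d":
        # log D: always non-positive
        return -softplus(-log_ratio)
    raise ValueError(f"unknown reward assignment: {kind!r}")


if __name__ == "__main__":
    for l in (-2.0, 0.0, 2.0):
        rewards = {k: reward_from_log_ratio(l, k) for k in ("gail", "airl", "log_d")}
        print(l, rewards)
```

All three functions consume the same density-ratio estimate and differ only in how they shape it into a reward, which is the design axis the abstract identifies as largely overlooked.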
Related papers
- DiffAIL: Diffusion Adversarial Imitation Learning (2023)
- Discriminator-actor-critic: Addressing Sample Inefficiency And Reward Bias In Adversarial Imitation Learning (2018)
- ARC - Actor Residual Critic For Adversarial Imitation Learning (2022)
- Provably Efficient Adversarial Imitation Learning With Unknown Transitions (2023)
- Rethinking Adversarial Inverse Reinforcement Learning: Policy Imitation, Transferable Reward Recovery And Algebraic Equilibrium Proof (2024)
- Provably Efficient Off-policy Adversarial Imitation Learning With Convergence Guarantees (2024)
- Mimicking Better By Matching The Approximate Action Distribution (2023)
- Imitating Opponent To Win: Adversarial Policy Imitation Learning In Two-player Competitive Games (2022)