Imitation Learning From Observation Through Optimal Transport
2023 · Wei-di Chang, Scott Fujimoto, David Meger, et al.
Abstract
Imitation Learning from Observation (ILfO) is a setting in which a learner tries to imitate the behavior of an expert, using only observational data and without the direct guidance of demonstrated actions. In this paper, we re-examine optimal transport for IL, in which a reward is generated based on the Wasserstein distance between the state trajectories of the learner and expert. We show that existing methods can be simplified to generate a reward function without requiring learned models or adversarial learning. Unlike many other state-of-the-art methods, our approach can be integrated with any RL algorithm and is amenable to ILfO. We demonstrate the effectiveness of this simple approach on a variety of continuous control tasks and find that it surpasses the state of the art in the ILfO setting, achieving expert-level performance across a range of evaluation domains even when observing only a single expert trajectory without actions.
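To make the idea of a transport-based reward concrete, the following is a minimal sketch of how a per-step reward could be derived from an optimal transport coupling between a learner's and an expert's state trajectories. It is not the authors' implementation. Several choices here are assumptions made for illustration only: entropic regularization (Sinkhorn iterations) stands in for the exact Wasserstein distance, the ground cost is the Euclidean distance between states, both trajectories carry uniform weights, and the names `sinkhorn_plan`, `ot_rewards`, `reg`, and `n_iters` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.05, n_iters=200):
    """Approximate an OT coupling between two uniformly weighted point sets.

    Assumption: entropic regularization is used here for simplicity; it
    approximates, but is not identical to, the exact Wasserstein coupling.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass on learner steps
    b = np.full(m, 1.0 / m)  # uniform mass on expert steps
    K = np.exp(-cost / reg)  # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    # Coupling matrix: entry (t, j) is the mass moved from learner step t
    # to expert step j.
    return u[:, None] * K * v[None, :]


def ot_rewards(learner_states, expert_states, reg=0.05):
    """Return one reward per learner step: the negative transport cost
    attributed to that step. No actions are used, only states."""
    diff = learner_states[:, None, :] - expert_states[None, :, :]
    cost = np.linalg.norm(diff, axis=-1)  # Euclidean ground cost (assumption)
    # Rescale the cost to [0, 1] only for computing the plan, which keeps
    # exp(-cost / reg) away from numerical underflow.
    cost_scaled = cost / (cost.max() + 1e-8)
    plan = sinkhorn_plan(cost_scaled, reg)
    return -(plan * cost).sum(axis=1)
```

Because the output is simply a reward for each step of the learner's trajectory, it could in principle be handed to any RL algorithm in place of the environment reward, which is consistent with the abstract's claim that no learned model or adversarial training is needed.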
Related papers
- Provably Efficient Imitation Learning From Observation Alone (2019)
- Imitation Learning From Observation With Automatic Discount Scheduling (2023)
- Is Optimal Transport Necessary For Inverse Reinforcement Learning? (2025)
- Primal Wasserstein Imitation Learning (2020)
- Imitation From Observation With Bootstrapped Contrastive Learning (2023)
- Understanding Reward Ambiguity Through Optimal Transport Theory In Inverse Reinforcement Learning (2023)
- Towards Generalisable Imitation Learning Through Conditioned Transition Estimation And Online Behaviour Alignment (2026)
- Can Optimal Transport Improve Federated Inverse Reinforcement Learning? (2026)