Imitation From Observation With Bootstrapped Contrastive Learning
2023 · Medric Sonwa, Johanna Hansen, Eugene Belilovsky
Abstract
Imitation from observation (IfO) is a learning paradigm in which autonomous agents are trained in a Markov Decision Process (MDP) by observing expert demonstrations, without access to the expert's actions. These demonstrations can be sequences of environment states or raw visual observations of the environment. Recent work in IfO has focused on the case where observations are low-dimensional environment states; however, access to such highly specific observations is unlikely in practice. In this paper, we adopt a challenging but more realistic problem formulation: learning control policies that operate on a learned latent space with access only to visual demonstrations of an expert completing a task. We present BootIfOL, an IfO algorithm that aims to learn a reward function that takes an agent trajectory and compares it to an expert, providing rewards based on similarity to agent behavior and implicit goal. We consider this reward function to be a distance metric between tra
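The idea of a reward defined as a distance between trajectories in a latent space can be illustrated with a minimal sketch. This is not the BootIfOL implementation: the names `embed_trajectory` and `trajectory_reward` are hypothetical, the encoder is a mean-pooling placeholder standing in for a learned (contrastively trained) sequence encoder, and the choice of Euclidean distance is an assumption made only for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def embed_trajectory(frames: Sequence[Vector]) -> List[float]:
    """Hypothetical stand-in for a learned trajectory encoder.

    Assumption: each frame is already a feature vector. A real system
    would encode raw images with a trained network; here we just
    average the per-frame features into one trajectory embedding.
    """
    if not frames:
        raise ValueError("trajectory must contain at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def trajectory_reward(agent_frames: Sequence[Vector],
                      expert_frames: Sequence[Vector]) -> float:
    """Reward as the negative Euclidean distance between embeddings.

    The closer the agent trajectory is to the expert trajectory in the
    embedding space, the higher (closer to zero) the reward.
    """
    agent_z = embed_trajectory(agent_frames)
    expert_z = embed_trajectory(expert_frames)
    return -math.dist(agent_z, expert_z)


# Toy trajectories made of 2-D feature vectors (illustrative values only).
expert = [[0.0, 0.0], [1.0, 1.0]]
near_expert = [[0.0, 0.2], [1.0, 1.0]]
far_from_expert = [[3.0, 3.0], [4.0, 4.0]]

reward_near = trajectory_reward(near_expert, expert)
reward_far = trajectory_reward(far_from_expert, expert)
```

Under these assumptions, `reward_near` is larger than `reward_far`, so a reinforcement learning agent maximizing this reward is pushed toward trajectories whose embeddings resemble the expert's.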
Related papers
- DEALIO: Data-efficient Adversarial Learning For Imitation From Observation (2021)
- Imitation Learning From Observation Through Optimal Transport (2023)
- Imitation Learning From Observation With Automatic Discount Scheduling (2023)
- A Dual Approach To Imitation Learning From Observations With Offline Datasets (2024)
- State-only Imitation With Transition Dynamics Mismatch (2020)
- A Bayesian Solution To The Imitation Gap (2024)
- Imitation Learning From Observations By Minimizing Inverse Dynamics Disagreement (2019)
- Causal Imitation Learning With Unobserved Confounders (2022)