State-only Imitation With Transition Dynamics Mismatch
2020 · Tanmay Gangwani, Jian Peng
Abstract
Imitation Learning (IL) is a popular paradigm for training agents to achieve complicated goals by leveraging expert behavior, rather than dealing with the hardships of designing a correct reward function. With the environment modeled as a Markov Decision Process (MDP), most of the existing IL algorithms are contingent on the availability of expert demonstrations in the same MDP as the one in which a new imitator policy is to be learned. This is uncharacteristic of many real-life scenarios where discrepancies between the expert and the imitator MDPs are common, especially in the transition dynamics function. Furthermore, obtaining expert actions may be costly or infeasible, making the recent trend towards state-only IL (where expert demonstrations constitute only states or observations) ever so promising. Building on recent adversarial imitation approaches that are motivated by the idea of divergence minimization, we present a new state-only IL algorithm in this paper. It divides the ov
Related papers
- Conditional Kernel Imitation Learning For Continuous State Environments (2023)
- Provably Efficient Adversarial Imitation Learning With Unknown Transitions (2023)
- Toward The Fundamental Limits Of Imitation Learning (2020)
- A Bayesian Solution To The Imitation Gap (2024)
- Plan Your Target And Learn Your Skills: Transferable State-only Imitation Learning Via Decoupled Policy Optimization (2022)
- Mitigating Covariate Shift In Imitation Learning Via Offline Data Without Great Coverage (2021)
- A Simple Solution For Offline Imitation From Observations And Examples With Possibly Incomplete Trajectories (2023)
- Interactive And Hybrid Imitation Learning: Provably Beating Behavior Cloning (2024)