Imitation Learning From Observation With Automatic Discount Scheduling
2023 · Yuyang Liu, Weijun Dong, Yingdong Hu, et al.
Abstract
Humans often acquire new skills through observation and imitation. For robotic agents, learning from the plethora of unlabeled video demonstration data available on the Internet necessitates imitating the expert without access to its action, presenting a challenge known as Imitation Learning from Observations (ILfO). A common approach to tackle ILfO problems is to convert them into inverse reinforcement learning problems, utilizing a proxy reward computed from the agent's and the expert's observations. Nonetheless, we identify that tasks characterized by a progress dependency property pose significant challenges for such approaches; in these tasks, the agent needs to initially learn the expert's preceding behaviors before mastering the subsequent ones. Our investigation reveals that the main cause is that the reward signals assigned to later steps hinder the learning of initial behaviors. To address this challenge, we present a novel ILfO framework that enables the agent to master earl
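As a rough illustration of the setup described above, the sketch below computes a proxy reward from paired agent and expert observations and shows how a discount factor weights rewards at later steps. Everything here is an assumption made for illustration: the function names are hypothetical, the negative Euclidean distance stands in for whatever proxy reward the paper uses, and the paper's actual discount schedule is not shown because the abstract above is cut off.

```python
def proxy_rewards(agent_obs, expert_obs):
    """Hypothetical proxy reward: negative Euclidean distance between the
    agent's and the expert's observation at each time step (an assumption,
    not the reward used in the paper)."""
    rewards = []
    for a, e in zip(agent_obs, expert_obs):
        dist = sum((x - y) ** 2 for x, y in zip(a, e)) ** 0.5
        rewards.append(-dist)
    return rewards


def step_weights(horizon, gamma):
    """Weight gamma**t that a discounted return places on the reward at step t."""
    return [gamma ** t for t in range(horizon)]


def discounted_return(rewards, gamma):
    """Standard discounted sum of per-step rewards."""
    return sum(w * r for w, r in zip(step_weights(len(rewards), gamma), rewards))


if __name__ == "__main__":
    agent = [[0.0, 0.0], [3.0, 4.0], [1.0, 1.0]]
    expert = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
    r = proxy_rewards(agent, expert)
    # A small gamma concentrates the objective on early steps;
    # a gamma near 1 lets later-step rewards contribute almost equally.
    for gamma in (0.5, 0.99):
        print(gamma, step_weights(len(r), gamma), discounted_return(r, gamma))
```

With a small discount factor the weights on later steps shrink quickly, so reward signals from later steps have less influence on the objective than those from the early steps.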
Related papers
- Imitation Learning From Observation Through Optimal Transport (2023) 2.26
- A Dual Approach To Imitation Learning From Observations With Offline Datasets (2024) 0.00
- RLIF: Interactive Imitation Learning As Reinforcement Learning (2023) 0.00
- Provably Efficient Imitation Learning From Observation Alone (2019) 0.00
- Imitation From Observation With Bootstrapped Contrastive Learning (2023) 0.00
- Towards Inverse Reinforcement Learning For Limit Order Book Dynamics (2019) 0.00
- Co-imitation Learning Without Expert Demonstration (2021) 0.00
- Inverse Reinforcement Learning Without Reinforcement Learning (2023) 0.00