Backward Imitation And Forward Reinforcement Learning Via Bi-directional Model Rollouts
2022 · Yuxin Pan, Fangzhen Lin
Abstract
Traditional model-based reinforcement learning (RL) methods generate forward rollout traces with a learned dynamics model to reduce interactions with the real environment. A recent model-based RL method additionally learns a backward model, which specifies the conditional probability of the previous state given the previous action and the current state, and uses it to generate backward rollout trajectories. However, in this type of method, the samples from backward rollouts and those from forward rollouts are simply aggregated to optimize the policy with a model-free RL algorithm, which may decrease both sample efficiency and the convergence rate. This is because such an approach ignores the fact that backward rollout traces are often generated starting from high-value states and are certainly more instructive for the agent in improving its behavior. In this paper, we propose the backward imitation and forward reinforcement learning (BIFRL)
Related papers
- Model Imitation For Model-based Reinforcement Learning (2019)
- Double Check Your State Before Trusting It: Confidence-aware Bidirectional Offline Model-based Imagination (2022)
- Double Horizon Model-based Policy Optimization (2025)
- Inferential Induction: A Novel Framework For Bayesian Reinforcement Learning (2020)
- Guided Cooperation In Hierarchical Reinforcement Learning Via Model-based Rollout (2023)
- Backward Curriculum Reinforcement Learning (2022)
- BaRC: Backward Reachability Curriculum For Robotic Reinforcement Learning (2018)
- Reverse Forward Curriculum Learning For Extreme Sample And Demonstration Efficiency In Reinforcement Learning (2024)