Pretraining Deep Actor-critic Reinforcement Learning Algorithms With Expert Demonstrations
2018 · Xiaoqin Zhang, Huimin Ma
Abstract
Pretraining with expert demonstrations has been found useful in speeding up the training of deep reinforcement learning algorithms, since less online simulation data is required. Some approaches use supervised learning to speed up feature learning; others pretrain policies by imitating expert demonstrations. However, these methods are unstable and not suitable for actor-critic reinforcement learning algorithms. Also, some existing methods rely on the assumption that the demonstrations are globally optimal, which does not hold in most scenarios. In this paper, we employ expert demonstrations in an actor-critic reinforcement learning framework while ensuring that performance is not affected by the fact that the expert demonstrations are not globally optimal. We theoretically derive a method for computing policy gradients and value estimators with only expert demonstrations. Our method is theoretically plausible for actor-critic reinforcement learning algorithms that pretrains both policy and
Related papers
- Policy Gradient From Demonstration And Curiosity (2020)
- Monte Carlo Augmented Actor-critic For Sparse Reward Deep Reinforcement Learning From Suboptimal Demonstrations (2022)
- Efficient Reinforcement Learning From Demonstration Using Local Ensemble And Reparameterization With Split And Merge Of Expert Policies (2022)
- Efficient Performance Bounds For Primal-dual Reinforcement Learning From Demonstrations (2021)
- Learning From Demonstrations With SACR2: Soft Actor-critic With Reward Relabeling (2021)
- Learning Safe Policies With Expert Guidance (2018)
- Interactive Reinforcement Learning With Dynamic Reuse Of Prior Knowledge From Human/agent's Demonstration (2018)
- Cooperative Multi-agent Policy Gradients With Sub-optimal Demonstration (2018)