Abstract

Developing agents that can perform challenging complex tasks is the goal of reinforcement learning. The model-free reinforcement learning has been considered as a feasible solution. However, the state of the art research has been to develop increasingly complicated techniques. This increasing complexity makes the reconstruction difficult. Furthermore, the problem of reward dependency is still exists. As a result, research on imitation learning, which learns policy from a demonstration of experts, has begun to attract attention. Imitation learning directly learns policy based on data on the behavior of the experts without the explicit reward signal provided by the environment. However, imitation learning tries to optimize policies based on deep reinforcement learning such as trust region policy optimization. As a result, deep reinforcement learning based imitation learning also poses a crisis of reproducibility. The issue of complex model-free model has received considerable critical at

Authors

(none)

Tags

  • Uncategorized

Stats

Related papers