Understanding Multi-step Deep Reinforcement Learning: A Systematic Study Of The DQN Target
2019 · J. Fernando Hernandez-Garcia, Richard S. Sutton
Abstract
Multi-step methods such as Retrace(\(\lambda\)) and \(n\)-step \(Q\)-learning have become a crucial component of modern deep reinforcement learning agents. These methods are often evaluated as a part of bigger architectures and their evaluations rarely include enough samples to draw statistically significant conclusions about their performance. This type of methodology makes it difficult to understand how particular algorithmic details of multi-step methods influence learning. In this paper we combine the \(n\)-step action-value algorithms Retrace, \(Q\)-learning, Tree Backup, Sarsa, and \(Q(\sigma)\) with an architecture analogous to DQN. We test the performance of all these algorithms in the mountain car environment; this choice of environment allows for faster training times and larger sample sizes. We present statistical analyses on the effects of the off-policy correction, the backup length parameter \(n\), and the update frequency of the target network on the performance of these
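As a rough illustration of what the backup length \(n\) controls, the sketch below computes an uncorrected \(n\)-step \(Q\)-learning target: \(n\) discounted rewards plus a bootstrapped maximum over the target network's action values. The function name, argument names, and the numbers in the usage lines are assumptions made for this example and are not taken from the paper, whose exact definitions may differ.

```python
from typing import Sequence


def n_step_q_target(
    rewards: Sequence[float],
    bootstrap_q_values: Sequence[float],
    gamma: float,
    terminal: bool,
) -> float:
    """Uncorrected n-step Q-learning target (illustrative sketch only).

    rewards:            R_{t+1}, ..., R_{t+n} observed along the trajectory.
    bootstrap_q_values: target-network estimates Q(S_{t+n}, a) for every action a
                        (hypothetical input; supplied by the caller).
    gamma:              discount factor.
    terminal:           True if the episode ended within these n steps,
                        in which case no bootstrapping is done.
    """
    target = 0.0
    # Discounted sum of the n observed rewards.
    for k, reward in enumerate(rewards):
        target += (gamma ** k) * reward
    # Bootstrap from the greedy action value unless the episode terminated.
    if not terminal:
        target += (gamma ** len(rewards)) * max(bootstrap_q_values)
    return target


# Usage with made-up numbers: three steps with reward -1 each.
q_at_s_t_plus_n = [-10.0, -8.0, -9.0]
print(n_step_q_target([-1.0, -1.0, -1.0], q_at_s_t_plus_n, gamma=0.5, terminal=False))
print(n_step_q_target([-1.0, -1.0, -1.0], q_at_s_t_plus_n, gamma=0.5, terminal=True))
```

A larger \(n\) puts more weight on observed rewards and less on the bootstrapped estimate. This sketch applies no off-policy correction; methods such as Retrace and Tree Backup reweight the multi-step terms instead of summing them directly.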
Related papers
- Elastic Step DQN: A Novel Multi-step Algorithm To Alleviate Overestimation In Deep Q-networks (2022) 10.85
- Multi-step Reinforcement Learning: A Unifying Algorithm (2017) 12.68
- A Unified Approach For Multi-step Temporal-difference Learning With Eligibility Traces In Reinforcement Learning (2018) 6.77
- The Nature Of Temporal Difference Errors In Multi-step Distributional Reinforcement Learning (2022) 0.00
- Modular Multi-objective Deep Reinforcement Learning With Decision Values (2017) 10.74
- Long N-step Surrogate Stage Reward To Reduce Variances Of Deep Reinforcement Learning In Complex Problems (2022) 0.00
- DQN With Model-based Exploration: Efficient Learning On Environments With Sparse Rewards (2019) 0.00
- Generalization And Regularization In DQN (2018) 0.00