Elastic Step DQN: A Novel Multi-step Algorithm To Alleviate Overestimation In Deep Q-networks
2022 · Adrian Ly, Richard Dazeley, Peter Vamplew, et al.
Abstract
The Deep Q-Network (DQN) algorithm was the first reinforcement learning algorithm to use a deep neural network to successfully surpass human-level performance in a number of Atari learning environments. However, divergent and unstable behaviour have been long-standing issues in DQNs. The unstable behaviour is often characterised by overestimation in the \(Q\)-values, commonly referred to as the overestimation bias. To address the overestimation bias and the divergent behaviour, a number of heuristic extensions have been proposed. Notably, multi-step updates have been shown to drastically reduce unstable behaviour while improving the agent's training performance. However, agents are often highly sensitive to the selection of the multi-step update horizon (\(n\)), and our empirical experiments show that a poorly chosen static value for \(n\) can in many cases lead to worse performance than single-step DQN. Inspired by the success of \(n\)-step DQN and the effects that multi-step updates have on ov
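To make the role of the horizon \(n\) concrete, the sketch below computes the standard \(n\)-step bootstrapped target, \(\sum_{k=0}^{n-1} \gamma^k r_{t+k} + \gamma^n \max_a Q(s_{t+n}, a)\), that fixed-horizon multi-step DQN variants typically regress towards. It illustrates the generic \(n\)-step target only, not the Elastic Step DQN method itself. The function name `n_step_target`, its arguments, and the default discount are hypothetical choices made for this example.

```python
def n_step_target(rewards, bootstrap_q, gamma=0.99):
    """Generic n-step return: discounted rewards plus a bootstrapped tail.

    rewards     -- the n rewards r_t, ..., r_{t+n-1} (n = len(rewards))
    bootstrap_q -- assumed to be max_a Q(s_{t+n}, a) from a target network,
                   or 0.0 if the episode terminated within the n steps
    gamma       -- discount factor (0.99 is an illustrative default)
    """
    target = bootstrap_q
    # Fold from the last reward backwards: G = r + gamma * G_next.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# n = 1 recovers the usual single-step DQN target r + gamma * max_a Q(s', a).
single_step = n_step_target([1.0], bootstrap_q=2.0, gamma=0.5)

# n = 3 relies more on observed rewards and less on the bootstrapped estimate.
three_step = n_step_target([1.0, 1.0, 1.0], bootstrap_q=4.0, gamma=0.5)
```

With a larger \(n\), the bootstrapped \(Q\)-estimate is scaled by \(\gamma^n\), so any overestimation in it contributes less to the target, while the sum of sampled rewards contributes more variance. That trade-off is one way to read the sensitivity to \(n\) described in the abstract.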
Related papers
- Understanding Multi-step Deep Reinforcement Learning: A Systematic Study Of The DQN Target (2019)
- Iterated \(Q\)-network: Beyond One-step Bellman Updates In Deep Reinforcement Learning (2024)
- Convergent And Efficient Deep Q Network Algorithm (2021)
- Modified Double DQN: Addressing Stability (2021)
- DQN With Model-based Exploration: Efficient Learning On Environments With Sparse Rewards (2019)
- Weighted Double Deep Multiagent Reinforcement Learning In Stochastic Cooperative Environments (2018)
- Sampling Efficient Deep Reinforcement Learning Through Preference-guided Stochastic Exploration (2022)
- A Theoretical Analysis Of Deep Q-learning (2019)