Modified Double DQN: Addressing Stability
2021 · Shervin Halat, Mohammad Mehdi Ebadzadeh, Kiana Amani
Abstract
Inspired by the Double Q-learning algorithm, the Double DQN (DDQN) algorithm was originally proposed to address the overestimation issue in the original DQN algorithm. DDQN has shown, both theoretically and empirically, the importance of decoupling action selection from action evaluation when computing target values, and it achieved these benefits with only a simple adaptation of DQN, the minimal possible change, as its authors noted. Nevertheless, DDQN appears to take a step backward: the parameters of the policy network reappear in the target value function, even though DQN had originally removed them to tackle the serious problem of moving targets and the learning instability they cause. Therefore, this paper proposes three modifications to the DDQN algorithm, with the aim of maintaining performance in terms of both stability and ov
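To make the contrast between the two target computations concrete, here is a minimal sketch of the standard DQN and Double DQN targets. It assumes NumPy arrays of Q-values for a small toy batch; the function names, argument names, and numbers are hypothetical and chosen only for illustration. It does not implement the three modifications proposed in the paper.

```python
import numpy as np

def dqn_target(reward, gamma, q_next_target, done):
    # DQN: the target network both selects and evaluates the next action,
    # so the policy (online) network's parameters do not appear in the target.
    return reward + gamma * (1.0 - done) * q_next_target.max(axis=1)

def double_dqn_target(reward, gamma, q_next_online, q_next_target, done):
    # Double DQN: the online (policy) network selects the next action,
    # and the target network evaluates that selected action.
    best_action = q_next_online.argmax(axis=1)
    evaluated = q_next_target[np.arange(len(best_action)), best_action]
    return reward + gamma * (1.0 - done) * evaluated

# Toy batch of two transitions with three actions each (illustrative values).
q_next_online = np.array([[1.0, 2.0, 0.5],
                          [0.2, 0.1, 0.9]])
q_next_target = np.array([[3.0, 1.5, 2.0],
                          [0.4, 0.8, 0.3]])
reward = np.array([1.0, 0.0])
done = np.array([0.0, 0.0])
gamma = 0.9

y_dqn = dqn_target(reward, gamma, q_next_target, done)
y_ddqn = double_dqn_target(reward, gamma, q_next_online, q_next_target, done)
```

In this sketch `double_dqn_target` depends on `q_next_online`, so the Double DQN target shifts whenever the policy network is updated. That dependence is the reappearance of policy-network parameters in the target that the abstract describes.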
Related papers
- Elastic Step DQN: A Novel Multi-step Algorithm To Alleviate Overestimation In Deep Qnetworks (2022)
- Averaged-DQN: Variance Reduction And Stabilization For Deep Reinforcement Learning (2016)
- On The Estimation Bias In Double Q-learning (2021)
- M\(^2\)DQN: A Robust Method For Accelerating Deep Q-learning Network (2022)
- Stabilizing Q-learning With Linear Architectures For Provably Efficient Learning (2022)
- Finite-time Analysis For Double Q-learning (2020)
- Weighted Double Deep Multiagent Reinforcement Learning In Stochastic Cooperative Environments (2018)
- Finite-time Analysis Of Simultaneous Double Q-learning (2024)