Natural Gradient Deep Q-learning
2018 · Ethan Knight, Osher Lerner
Abstract
We present a novel algorithm for training a deep Q-learning agent using natural-gradient techniques. We compare the original deep Q-network (DQN) algorithm with its natural-gradient counterpart, which we refer to as NGDQN, on a collection of classic control domains. When neither algorithm uses a target network, NGDQN significantly outperforms DQN, and it performs no worse than DQN with target networks. This suggests that NGDQN stabilizes training and can reduce the need for additional hyperparameter tuning. We also find that NGDQN is less sensitive to hyperparameter choices than DQN. Together, these results suggest that natural-gradient techniques can improve value-function optimization in deep reinforcement learning.
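The abstract does not spell out the update rule, so the following is only a minimal sketch of what a natural-gradient Q-learning step can look like, not the authors' NGDQN implementation. It assumes a linear Q-function Q(s, a) = theta[a] · s in place of a deep network, a Gauss-Newton style Fisher approximation (the averaged outer product of the Q-value Jacobian), and a damping term for invertibility. The function name, its parameters, and the default values are all hypothetical. TD targets are computed from the current parameters, that is, without a target network.

```python
import numpy as np


def natural_gradient_q_step(theta, states, actions, rewards, next_states, dones,
                            gamma=0.99, lr=0.1, damping=1e-3):
    """One damped natural-gradient step for a linear Q-function (illustrative only).

    theta has shape (n_actions, n_features); Q(s, a) = theta[a] @ s.
    """
    n_actions, n_features = theta.shape
    n = states.shape[0]

    # TD targets from the current parameters (no target network), held constant.
    next_q = next_states @ theta.T                      # shape (n, n_actions)
    targets = rewards + gamma * (1.0 - dones) * next_q.max(axis=1)

    # Q-values of the actions actually taken, and the TD errors.
    q_taken = np.einsum("nd,nd->n", states, theta[actions])
    td_error = q_taken - targets

    # Jacobian of q_taken with respect to the flattened parameters:
    # row i is zero except in the block belonging to actions[i].
    jac = np.zeros((n, n_actions, n_features))
    jac[np.arange(n), actions] = states
    jac = jac.reshape(n, -1)

    # Ordinary gradient of the mean squared TD error (with a 1/2 factor).
    grad = jac.T @ td_error / n

    # Assumed Fisher approximation: averaged J^T J plus damping.
    fisher = jac.T @ jac / n + damping * np.eye(jac.shape[1])

    # Natural gradient: solve F x = grad rather than inverting F explicitly.
    nat_grad = np.linalg.solve(fisher, grad)
    return theta - lr * nat_grad.reshape(theta.shape)


# Hypothetical usage on random transitions.
rng = np.random.default_rng(0)
batch, n_features, n_actions = 32, 4, 2
theta = np.zeros((n_actions, n_features))
theta = natural_gradient_q_step(
    theta,
    states=rng.normal(size=(batch, n_features)),
    actions=rng.integers(0, n_actions, size=batch),
    rewards=rng.normal(size=batch),
    next_states=rng.normal(size=(batch, n_features)),
    dones=np.zeros(batch),
)
```

Preconditioning the gradient with the inverse Fisher matrix makes the step depend on how the Q-values change rather than on the raw parameter scale. A deep network would need a scalable approximation in place of the dense solve used here.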
Related papers
- GB-DQN: Gradient Boosted DQN Models For Non-stationary Reinforcement Learning (2025)
- Quantum Natural Policy Gradients: Towards Sample-efficient Reinforcement Learning (2023)
- Deep Q-networks For Accelerating The Training Of Deep Neural Networks (2016)
- Efficient Wasserstein Natural Gradients For Reinforcement Learning (2020)
- Deep Q-learning: A Robust Control Approach (2022)
- Convergent And Efficient Deep Q Network Algorithm (2021)
- Approximating Gradients For Differentiable Quality Diversity In Reinforcement Learning (2022)
- Elastic Step DQN: A Novel Multi-step Algorithm To Alleviate Overestimation In Deep Qnetworks (2022)