Faster Deep Reinforcement Learning With Slower Online Network
2021 · Kavosh Asadi, Rasool Fakoor, Omer Gottesman, et al.
Abstract
Deep reinforcement learning algorithms often use two networks for value function optimization: an online network and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping. In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network. This improves the robustness of deep reinforcement learning in the presence of noisy updates. The resultant agents, called DQN Pro and Rainbow Pro, exhibit significant performance improvements over their original counterparts on the Atari benchmark, demonstrating the effectiveness of this simple idea in deep reinforcement learning. The code for our paper is available here: Github.com/amazon-research/fast-rl-with-slow-updates.
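The abstract does not spell out the update rule, so the following is only a minimal sketch of one way to keep the online network close to the target network. It assumes a quadratic penalty on the distance between the two parameter vectors, added to the usual TD loss. The names `proximal_penalty`, `proximal_step`, and the coefficient `c` are hypothetical and are not taken from the authors' repository.

```python
from typing import List


def proximal_penalty(online: List[float], target: List[float], c: float) -> float:
    """Assumed penalty: (1 / (2c)) * squared distance between parameter vectors."""
    return sum((w - t) ** 2 for w, t in zip(online, target)) / (2.0 * c)


def proximal_step(
    online: List[float],
    target: List[float],
    td_grad: List[float],
    lr: float,
    c: float,
) -> List[float]:
    """One gradient step on (TD loss + assumed penalty).

    The penalty's gradient with respect to each online weight is (w - t) / c,
    which pulls the online weights back toward the target weights.
    """
    return [w - lr * (g + (w - t) / c) for w, t, g in zip(online, target, td_grad)]


# Toy parameters; a real agent would use network weights and a sampled TD gradient.
online = [1.0, -2.0]
target = [0.0, 0.0]
td_grad = [0.0, 0.0]  # zero TD gradient isolates the effect of the penalty

penalty = proximal_penalty(online, target, c=0.5)
stepped = proximal_step(online, target, td_grad, lr=0.1, c=0.5)
```

With a zero TD gradient, the step only shrinks the gap between the online and target weights. In this sketch a smaller `c` means a stronger pull toward the target network, and a very large `c` approaches an ordinary gradient step on the TD loss.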
Related papers
- Revisiting Rainbow: Promoting More Insightful And Inclusive Deep Reinforcement Learning Research (2020) 0.00
- Deep Q-networks For Accelerating The Training Of Deep Neural Networks (2016) 0.00
- T-soft Update Of Target Network For Deep Reinforcement Learning (2020) 13.39
- Accelerated Methods For Deep Reinforcement Learning (2018) 0.00
- Adaptive \(q\)-network: On-the-fly Target Selection For Deep Reinforcement Learning (2024) 0.00
- Boosting Reinforcement Learning With Strongly Delayed Feedback Through Auxiliary Short Delays (2024) 1.69
- Learning Fast Changing Slow In Spiking Neural Networks (2024) 3.58
- Importance Of Using Appropriate Baselines For Evaluation Of Data-efficiency In Deep Reinforcement Learning For Atari (2020) 0.00