Adam On Local Time: Addressing Nonstationarity In RL With Relative Adam Timesteps
2024 · Benjamin Ellis, Matthew T. Jackson, Andrei Lupu, et al.
Abstract
In reinforcement learning (RL), it is common to apply techniques used broadly in machine learning such as neural network function approximators and momentum-based optimizers. However, such tools were largely developed for supervised learning rather than nonstationary RL, leading practitioners to adopt target networks, clipped policy updates, and other RL-specific implementation tricks to combat this mismatch, rather than directly adapting this toolchain for use in RL. In this paper, we take a different approach and instead address the effect of nonstationarity by adapting the widely used Adam optimiser. We first analyse the impact of nonstationary gradient magnitude -- such as that caused by a change in target network -- on Adam's update size, demonstrating that such a change can lead to large updates and hence sub-optimal performance. To address this, we introduce Adam-Rel. Rather than using the global timestep in the Adam update, Adam-Rel uses the local timestep within an epoch, esse
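The effect described in the abstract can be illustrated with a minimal scalar sketch. It is not the authors' implementation: the function names, the constants, and the 100x gradient jump are hypothetical. It also assumes that the moment estimates are kept and only the timestep used for bias correction restarts, which the truncated abstract does not state explicitly. The sketch warms Adam up on a constant gradient, then compares the first update after the gradient magnitude jumps, once with the global timestep and once with a local timestep restarted at 1.

```python
import math


def adam_update(grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One scalar Adam step. `t` is the 1-based timestep used for bias correction.

    Returns (update, m, v), where `update` is the amount subtracted from the parameter.
    """
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return lr * m_hat / (math.sqrt(v_hat) + eps), m, v


def first_update_after_jump(relative, warmup_steps=10_000, g=1.0, scale=100.0):
    """Size of the first update after the gradient grows by `scale`.

    relative=False: keep counting the global timestep (standard Adam).
    relative=True:  restart the timestep at 1 but keep m and v
                    (assumed reading of the local-timestep idea).
    """
    m = v = 0.0
    for t in range(1, warmup_steps + 1):
        _, m, v = adam_update(g, m, v, t)
    t_next = 1 if relative else warmup_steps + 1
    update, _, _ = adam_update(scale * g, m, v, t_next)
    return update


global_update = first_update_after_jump(relative=False)
local_update = first_update_after_jump(relative=True)
print(f"global timestep: update / lr = {global_update / 1e-3:.2f}")
print(f"local timestep:  update / lr = {local_update / 1e-3:.2f}")
```

With these particular constants, the global-timestep update comes out several times larger than the learning rate, while the local-timestep update stays close to it. The size of the gap depends on the betas and on the jump chosen here.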
Related papers
- Non-asymptotic Convergence Of Adam-type Reinforcement Learning Algorithms Under Markovian Sampling (2020)
- Tempo Adaptation In Non-stationary Reinforcement Learning (2023)
- Model-agnostic Solutions For Deep Reinforcement Learning In Non-ergodic Contexts (2026)
- Demystifying Reinforcement Learning In Time-varying Systems (2022)
- Transient Non-stationarity And Generalisation In Deep Reinforcement Learning (2020)
- Dynamic Learning Rate For Deep Reinforcement Learning: A Bandit Approach (2024)
- Moments Matter: Stabilizing Policy Optimization Using Return Distributions (2026)
- Regularizing A Model-based Policy Stationary Distribution To Stabilize Offline Reinforcement Learning (2022)