Metatrace Actor-critic: Online Step-size Tuning By Meta-gradient Descent For Reinforcement Learning Control
2018 · Kenny Young, Baoxiang Wang, Matthew E. Taylor
Abstract
Reinforcement learning (RL) has had many successes in both "deep" and "shallow" settings. In both cases, significant hyperparameter tuning is often required to achieve good performance. Furthermore, when nonlinear function approximation is used, non-stationarity in the state representation can lead to learning instability. A variety of techniques exist to combat this, most notably large experience replay buffers or the use of multiple parallel actors. These techniques come at the cost of moving away from the online RL problem as it is traditionally formulated (i.e., a single agent learning online without maintaining a large database of training examples). Meta-learning can potentially help with both of these issues by tuning hyperparameters online and allowing the algorithm to adjust more robustly to non-stationarity in a problem. This paper applies meta-gradient descent to derive a set of step-size tuning algorithms specifically for online RL control with eligibility traces. Our novel
Related papers
- Meta-gradient Reinforcement Learning With An Objective Discovered Online (2020)
- One Step At A Time: Pros And Cons Of Multi-step Meta-gradient Reinforcement Learning (2021)
- A Self-tuning Actor-critic Algorithm (2020)
- Meta Sac-lag: Towards Deployable Safe Reinforcement Learning Via Metagradient-based Hyperparameter Tuning (2024)
- Stepsize Learning For Policy Gradient Methods In Contextual Markov Decision Processes (2023)
- Hyperparameter Tuning For Deep Reinforcement Learning Applications (2022)
- Meta-reinforcement Learning For The Tuning Of PI Controllers: An Offline Approach (2022)
- Debiasing Meta-gradient Reinforcement Learning By Learning The Outer Value Function (2022)