Reactive Exploration To Cope With Non-stationarity In Lifelong Reinforcement Learning
2022 · Christian Steinparz, Thomas Schmied, Fabian Paischer, et al.
Abstract
In lifelong learning, an agent learns throughout its entire life without resets, in a constantly changing environment, as we humans do. Consequently, lifelong learning comes with a plethora of research problems such as continual domain shifts, which result in non-stationary rewards and environment dynamics. These non-stationarities are difficult to detect and cope with due to their continuous nature. Therefore, exploration strategies and learning methods are required that are capable of tracking the steady domain shifts, and adapting to them. We propose Reactive Exploration to track and react to continual domain shifts in lifelong reinforcement learning, and to update the policy correspondingly. To this end, we conduct experiments in order to investigate different exploration strategies. We empirically show that representatives of the policy-gradient family are better suited for lifelong learning, as they adapt more quickly to distribution shifts than Q-learning. Thereby, policy-gradie
Related papers
- Learning Adaptive Exploration Strategies In Dynamic Environments Through Informed Policy Regularization (2020)
- Demystifying Reinforcement Learning In Time-varying Systems (2022)
- L2explorer: A Lifelong Reinforcement Learning Assessment Environment (2022)
- Continuous Coordination As A Realistic Scenario For Lifelong Learning (2021)
- Online Reinforcement Learning In Non-stationary Context-driven Environments (2023)
- Curious Explorer: A Provable Exploration Strategy In Policy Learning (2021)
- Some Insights Into Lifelong Reinforcement Learning Systems (2020)
- A Behavior-aware Approach For Deep Reinforcement Learning In Non-stationary Environments Without Known Change Points (2024)