Rethinking The Foundations For Continual Reinforcement Learning
2025 · Esraa Elelimy, David Szepesvari, Martha White, et al.
Abstract
In the traditional view of reinforcement learning, the agent's goal is to find an optimal policy that maximizes its expected sum of rewards. Once the agent finds this policy, the learning ends. This view contrasts with *continual reinforcement learning*, where learning does not end, and agents are expected to continually learn and adapt indefinitely. Despite the clear distinction between these two paradigms of learning, much of the progress in continual reinforcement learning has been shaped by foundations rooted in the traditional view of reinforcement learning. In this paper, we first examine whether the foundations of traditional reinforcement learning are suitable for the continual reinforcement learning paradigm. We identify four key pillars of the traditional reinforcement learning foundations that are antithetical to the goals of continual learning: the Markov decision process formalism, the focus on atemporal artifacts, the expected sum of rewards as an evaluation metric, and e
Related papers
- A Definition Of Continual Reinforcement Learning (2023)
- Towards Continual Reinforcement Learning: A Review And Perspectives (2020)
- Advancements And Challenges In Continual Reinforcement Learning: A Comprehensive Review (2025)
- Ergodic Risk Measures: Towards A Risk-aware Foundation For Continual Reinforcement Learning (2025)
- A Survey Of Continual Reinforcement Learning (2025)
- Position: Lifetime Tuning Is Incompatible With Continual Reinforcement Learning (2024)
- Continual Learning As Computationally Constrained Reinforcement Learning (2023)
- Continual World: A Robotic Benchmark For Continual Reinforcement Learning (2021)