Mitigating Planner Overfitting In Model-based Reinforcement Learning
2018 · Dilip Arumugam, David Abel, Kavosh Asadi, et al.
Abstract
An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model. Alternatively, it can take a more conservative stance and eschew its model in favor of optimizing its behavior solely via real-world interaction. This latter approach can be exceedingly slow to learn from experience, while the former can lead to "planner overfitting" - aspects of the agent's behavior are optimized to exploit errors in its model. This paper explores an intermediate position in which the planner seeks to avoid overfitting through a kind of regularization of the plans it considers. We present three different approaches that demonstrably mitigate planner overfitting in reinforcement-learning environments.
Related papers
- Overestimation, Overfitting, And Plasticity In Actor-critic: The Bitter Lesson Of Reinforcement Learning (2024)
- COPlanner: Plan To Roll Out Conservatively But To Explore Optimistically For Model-based RL (2023)
- Observational Overfitting In Reinforcement Learning (2019)
- Self-correcting Models For Model-based Reinforcement Learning (2016)
- A KL-regularization Framework For Learning To Plan With Adaptive Priors (2025)
- Learning To Combat Compounding-error In Model-based Reinforcement Learning (2019)
- Improving Generalization To New Environments And Removing Catastrophic Forgetting In Reinforcement Learning By Using An Eco-system Of Agents (2022)
- Mind The Model, Not The Agent: The Primacy Bias In Model-based RL (2023)