Synthesizing World Models For Bilevel Planning
2025 · Zergham Ahmed, Joshua B. Tenenbaum, Christopher J. Bates, et al.
Abstract
Modern reinforcement learning (RL) systems have demonstrated remarkable capabilities in complex environments, such as video games. However, they still fall short of achieving human-like sample efficiency and adaptability when learning new domains. Theory-based reinforcement learning (TBRL) is an algorithmic framework specifically designed to address this gap. Modeled on cognitive theories, TBRL leverages structured, causal world models - "theories" - as forward simulators for use in planning, generalization and exploration. Although current TBRL systems provide compelling explanations of how humans learn to play video games, they face several technical limitations: their theory languages are restrictive, and their planning algorithms are not scalable. To address these challenges, we introduce TheoryCoder, an instantiation of TBRL that exploits hierarchical representations of theories and efficient program synthesis methods for more powerful learning and planning. TheoryCoder equips age
Related papers
- Human-level Reinforcement Learning Through Theory-based Modeling, Exploration, And Planning (2021)
- Continual Reinforcement Learning By Planning With Online World Models (2025)
- Exploring The Limits Of Hierarchical World Models In Reinforcement Learning (2024)
- HarmonyDream: Task Harmonization Inside World Models (2023)
- Think In Games: Learning To Reason In Games Via Reinforcement Learning With Large Language Models (2025)
- Bridging Imagination And Reality For Model-based Deep Reinforcement Learning (2020)
- Fast Exploration With Simplified Models And Approximately Optimistic Planning In Model Based Reinforcement Learning (2018)
- BlendRL: A Framework For Merging Symbolic And Neural Policy Learning (2024)