Programmatic Reinforcement Learning: Navigating Gridworlds
2024 · Guruprerana Shabadi, Nathanaël Fijalkow, Théo Matricon
Abstract
The field of reinforcement learning (RL) is concerned with algorithms for learning optimal policies in unknown stochastic environments. Programmatic RL studies representations of policies as programs, that is, policies built from higher-order constructs such as control loops. Although programmatic RL has attracted a lot of attention at the intersection of the machine learning and formal methods communities, very little is known about it on the theoretical front: what are good classes of programmatic policies? How large are optimal programmatic policies? How can we learn them? The goal of this paper is to give first answers to these questions, initiating a theoretical study of programmatic RL. Considering a class of gridworld environments, we define a class of programmatic policies. Our main contributions are to place upper bounds on the size of optimal programmatic policies, and to construct an algorithm for synthesizing them. These theoretical findings are complemented by a prototype implementation of
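To make the notion of a policy "as a program" concrete, the following is a minimal illustrative sketch and not the policy class defined in the paper. It assumes a deterministic gridworld encoded as a list of strings (`#` for walls, `G` for the goal), whereas the abstract concerns stochastic environments. The names `step`, `move_until_blocked`, and `policy` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple

Pos = Tuple[int, int]  # (row, col)

# Hypothetical action set: one-cell moves in the four compass directions.
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


def step(grid: List[str], pos: Pos, action: str) -> Pos:
    """Deterministic move; stay in place when blocked by a wall or the border."""
    dr, dc = MOVES[action]
    r, c = pos[0] + dr, pos[1] + dc
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#":
        return (r, c)
    return pos


def move_until_blocked(grid: List[str], pos: Pos, action: str, trace: List[Pos]) -> Pos:
    """Control loop: repeat one action until it no longer changes the position."""
    while True:
        nxt = step(grid, pos, action)
        if nxt == pos:
            return pos
        pos = nxt
        trace.append(pos)


def policy(grid: List[str], start: Pos) -> List[Pos]:
    """A two-loop program: go east until blocked, then south until blocked."""
    trace = [start]
    pos = move_until_blocked(grid, start, "E", trace)
    move_until_blocked(grid, pos, "S", trace)
    return trace


# Assumed toy instance: 3x4 grid with one wall and the goal in the bottom-right cell.
GRID = [
    "....",
    ".#..",
    "...G",
]
trace = policy(GRID, (0, 0))
reached_goal = GRID[trace[-1][0]][trace[-1][1]] == "G"
```

The point of the sketch is that the policy is two loops long regardless of the grid's dimensions, whereas a tabular policy would need one entry per cell. This is the kind of size question the abstract raises.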
Related papers
- Learning Of Generalizable And Interpretable Knowledge In Grid-based Reinforcement Learning Environments (2023) · 3.58
- Direct And Indirect Reinforcement Learning (2019) · 10.74
- Interpretable Policies For Reinforcement Learning By Genetic Programming (2017) · 14.76
- Human-readable Programs As Actors Of Reinforcement Learning Agents Using Critic-moderated Evolution (2024) · 0.00
- Discovering Reinforcement Learning Algorithms (2020) · 0.00
- PC-MLP: Model-based Reinforcement Learning With Policy Cover Guided Exploration (2021) · 0.00
- Policy Gradient RL Algorithms As Directed Acyclic Graphs (2020) · 0.00
- PWM: Policy Learning With Multi-task World Models (2024) · 0.00