A Compression-inspired Framework For Macro Discovery
2017 · Francisco M. Garcia, Bruno C. da Silva, Philip S. Thomas
Abstract
In this paper we consider how a reinforcement learning agent tasked with solving a set of related Markov decision processes (MDPs) can use knowledge acquired early in its lifetime to solve novel but related tasks more rapidly. One way of exploiting this experience is to identify recurrent patterns in trajectories obtained from well-performing policies. We propose a three-step framework in which an agent 1) generates a set of candidate open-loop macros by compressing trajectories drawn from near-optimal policies; 2) evaluates the value of each macro; and 3) selects a maximally diverse subset of macros that spans the space of policies typically required for solving the set of related tasks. Our experiments show that extending the agent's original primitive action set with the identified macros allows it to learn an optimal policy more rapidly in unseen but similar MDPs.
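The three steps can be illustrated with a small sketch. It is not the authors' method: counting repeated action subsequences (n-grams) stands in for the compression step, the score is a hypothetical proxy for a macro's value (primitive actions saved if every occurrence were replaced by one symbol), and diversity is approximated by a greedy filter using `difflib.SequenceMatcher`. All function names, parameters, and the toy trajectories are assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def candidate_macros(trajectories, min_len=2, max_len=3):
    """Step 1 (stand-in): count every contiguous action subsequence."""
    counts = Counter()
    for actions in trajectories:
        for n in range(min_len, max_len + 1):
            for i in range(len(actions) - n + 1):
                counts[tuple(actions[i:i + n])] += 1
    return counts


def score(macro, count):
    """Step 2 (hypothetical proxy): actions saved by reusing the macro."""
    return count * (len(macro) - 1)


def dissimilarity(a, b):
    """1 minus the SequenceMatcher similarity ratio of two macros."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def select_macros(trajectories, k=2, min_dissimilarity=0.5):
    """Step 3 (greedy assumption): keep high-scoring, mutually distinct macros."""
    counts = candidate_macros(trajectories)
    # Sort by descending score; break ties lexicographically for determinism.
    ranked = sorted(counts, key=lambda m: (-score(m, counts[m]), m))
    chosen = []
    for macro in ranked:
        if all(dissimilarity(macro, c) >= min_dissimilarity for c in chosen):
            chosen.append(macro)
        if len(chosen) == k:
            break
    return chosen


# Toy action sequences, standing in for trajectories from near-optimal policies.
trajectories = [
    ["up", "up", "right", "right"],
    ["up", "up", "right", "down"],
    ["down", "down", "left", "left"],
    ["down", "down", "left", "up"],
]
macros = select_macros(trajectories)
print(macros)
```

On these toy trajectories the sketch selects `("down", "down", "left")` and `("up", "up", "right")`: each occurs twice and the two share no actions, so both pass the diversity filter. In the setting the abstract describes, such macros would then be added to the agent's primitive action set.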