HiMAC: Hierarchical Macro-micro Learning For Long-horizon LLM Agents
2026 · Hongbo Jin, Rongpeng Zhu, Jiayu Ding, et al.
Abstract
arXiv:2603.00977v2
Large language model (LLM) agents have recently demonstrated strong capabilities in interactive decision-making, yet they remain fundamentally limited in long-horizon tasks that require structured planning and reliable execution. Existing approaches predominantly rely on flat autoregressive policies, where high-level reasoning and low-level actions are generated within a single token sequence, leading to inefficient exploration and severe error propagation over extended trajectories. In this work, we propose HiMAC, a hierarchical agentic RL framework that explicitly decomposes long-horizon decision-making into macro-level planning and micro-level execution. HiMAC models reasoning as a structured blueprint generation process followed by goal-conditioned action execution, enabling robust long-horizon planning within LLM-based agents. To train this hierarchy efficiently, we introduce a critic-free hierarchical policy optimization paradi
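The macro/micro split described in the abstract can be illustrated with a small control-loop sketch. This is not the authors' implementation: the names `run_episode`, `toy_macro`, `toy_micro`, and `toy_env` are hypothetical, the policies are plain functions standing in for LLM calls, and the training procedure is not modeled at all. It only shows a blueprint being generated once and then executed one subgoal at a time.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    subgoal: str
    action: str
    observation: str


def run_episode(
    task: str,
    macro_policy: Callable[[str], List[str]],
    micro_policy: Callable[[str, str], str],
    env_step: Callable[[str], Tuple[str, bool]],
    max_actions_per_subgoal: int = 3,
) -> Tuple[List[str], List[Step]]:
    """Generate a blueprint once, then execute it subgoal by subgoal."""
    # Macro level: the whole plan (a list of subgoals) is produced up front.
    blueprint = macro_policy(task)
    trajectory: List[Step] = []
    observation = task
    for subgoal in blueprint:
        # Micro level: actions are conditioned on the current subgoal only.
        for _ in range(max_actions_per_subgoal):
            action = micro_policy(subgoal, observation)
            observation, subgoal_done = env_step(action)
            trajectory.append(Step(subgoal, action, observation))
            if subgoal_done:
                break
    return blueprint, trajectory


# Toy stand-ins (assumptions for illustration, not part of HiMAC).
def toy_macro(task: str) -> List[str]:
    return [part.strip() for part in task.split(" then ")]


def toy_micro(subgoal: str, observation: str) -> str:
    return f"do[{subgoal}]"


def toy_env(action: str) -> Tuple[str, bool]:
    return f"ok: {action}", True


blueprint, trajectory = run_episode(
    "find mug then heat mug", toy_macro, toy_micro, toy_env
)
```

In contrast to a flat policy that interleaves reasoning and actions in one token stream, the inner loop here never sees the full plan, so a mistake made under one subgoal does not change the remaining blueprint.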
Related papers
- End-to-end Optimization Of LLM-driven Multi-agent Search Systems Via Heterogeneous-group-based Reinforcement Learning (2025)
- Guiding Multi-agent Multi-task Reinforcement Learning By A Hierarchical Framework With Logical Reward Shaping (2024)
- Hierarchy Through Composition With Linearly Solvable Markov Decision Processes (2016)
- DLM: Unified Decision Language Models For Offline Multi-agent Sequential Decision Making (2026)
- SAC-GLAM: Improving Online RL For LLM Agents With Soft Actor-critic And Hindsight Relabeling (2024)
- Klong: Training LLM Agent For Extremely Long-horizon Tasks (2026)
- Hierarchical Deep Multiagent Reinforcement Learning With Temporal Abstraction (2018)
- Macro-action-based Multi-agent/robot Deep Reinforcement Learning Under Partial Observability (2022)