CARL: Focusing Agentic Reinforcement Learning On Critical Actions
2025 · Leyang Shen, Yang Zhang, Chun Kai Ling, et al.
Abstract
Agents capable of accomplishing complex tasks through multiple interactions with the environment have emerged as a popular research direction. However, in such multi-step settings, the conventional group-level policy optimization algorithm becomes suboptimal because it assumes that every action contributes equally to the outcome, which deviates significantly from reality. Our analysis reveals that only a small fraction of actions are critical in determining the final outcome. Building on this insight, we propose CARL, a critical-action-focused reinforcement learning algorithm tailored for long-horizon agentic reasoning. CARL uses entropy as a heuristic proxy for action criticality and focuses training by assigning rewards to high-criticality actions while excluding low-criticality actions from model updates, which avoids noisy credit assignment and redundant computation. Extensive experiments demonstrate that CARL achieves both stronger performance and higher efficienc
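The idea of using entropy as a criticality proxy can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`entropy`, `critical_mask`, `focused_pg_loss`), the top-fraction selection rule, the `keep_fraction` parameter, and the REINFORCE-style loss are all assumptions made for illustration. The abstract states only that entropy serves as a heuristic proxy and that low-criticality actions are excluded from model updates.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def critical_mask(entropies, keep_fraction=0.25):
    """Mark the highest-entropy steps as critical.

    Assumption: criticality is decided by a top-fraction cutoff on entropy.
    The source does not specify the selection rule.
    """
    k = max(1, math.ceil(keep_fraction * len(entropies)))
    ranked = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(entropies))]


def focused_pg_loss(logprobs, advantage, mask):
    """Policy-gradient-style loss averaged over critical steps only.

    Steps with mask == False contribute nothing, so they would be
    excluded from the model update.
    """
    kept = [lp for lp, m in zip(logprobs, mask) if m]
    return -advantage * sum(kept) / len(kept)


# Hypothetical four-step trajectory: one action distribution per step.
step_dists = [
    [0.97, 0.01, 0.01, 0.01],  # near-deterministic step, low entropy
    [0.40, 0.30, 0.20, 0.10],  # uncertain step
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain step
    [0.90, 0.05, 0.03, 0.02],  # near-deterministic step, low entropy
]
entropies = [entropy(d) for d in step_dists]
mask = critical_mask(entropies, keep_fraction=0.5)

# Log-probabilities of the actions actually taken (made-up values).
logprobs = [-0.03, -0.9, -1.4, -0.1]
loss = focused_pg_loss(logprobs, advantage=1.0, mask=mask)
```

In this toy trajectory only the two uncertain steps are marked critical, so the trajectory-level advantage is applied to those steps alone and the two near-deterministic steps are skipped.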
Related papers
- ACE: Off-policy Actor-critic With Causality-aware Entropy Regularization (2024)
- Attention Actor-critic Algorithm For Multi-agent Constrained Co-operative Reinforcement Learning (2021)
- Context-aware Bayesian Network Actor-critic Methods For Cooperative Multi-agent Reinforcement Learning (2023)
- Local Advantage Actor-critic For Robust Multi-agent Deep Reinforcement Learning (2021)
- Actor-attention-critic For Multi-agent Reinforcement Learning (2018)
- Actor-critic Policy Optimization In Partially Observable Multiagent Environments (2018)
- CAMMARL: Conformal Action Modeling In Multi Agent Reinforcement Learning (2023)
- Learning To Coordinate In Multi-agent Systems: A Coordinated Actor-critic Algorithm And Finite-time Guarantees (2021)