CURO: Curriculum Learning For Relative Overgeneralization
2022 · Lin Shi, Qiyuan Liu, Bei Peng
Abstract
Relative overgeneralization (RO) is a pathology that can arise in cooperative multi-agent tasks when the optimal joint action's utility falls below that of a sub-optimal joint action. RO can cause the agents to get stuck in local optima or fail to solve cooperative tasks that require significant coordination between agents within a given timestep. In this work, we empirically find that, in multi-agent reinforcement learning (MARL), both value-based and policy gradient MARL algorithms can suffer from RO and fail to learn effective coordination policies. To better overcome RO, we propose a novel approach called curriculum learning for relative overgeneralization (CURO). To solve a target task that exhibits strong RO, CURO first fine-tunes the reward function of the target task to generate source tasks on which to train the agent. Then, to effectively transfer the knowledge acquired in one task to the next, we use a transfer learning method that combines value function transfer with buffer tr
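To make the definition of RO above concrete, the following is a minimal sketch on a one-step, two-agent matrix game. The payoff values, the uniform partner policy, and the helper names (`action_utilities`, `scale_penalties`, `argmax`) are illustrative assumptions, not taken from the paper. Scaling the penalties is only one hypothetical way to modify a reward function into an easier source task; it is not claimed to be the procedure CURO uses.

```python
# Shared-reward matrix game (illustrative values, not from the paper).
# Rows are agent 1's actions, columns are agent 2's actions.
PAYOFF = [
    [12.0, -12.0, -12.0],  # optimal joint action (0, 0), heavily penalised otherwise
    [0.0, 6.0, 0.0],
    [0.0, 0.0, 3.0],
]


def action_utilities(payoff, partner_policy):
    """Expected shared reward of each row action under the partner's policy."""
    return [sum(r * p for r, p in zip(row, partner_policy)) for row in payoff]


def scale_penalties(payoff, scale):
    """Hypothetical source task: multiply every negative reward by `scale`."""
    return [[scale * r if r < 0 else r for r in row] for row in payoff]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Assume the partner explores uniformly, as it might early in training.
uniform = [1.0 / 3.0] * 3

# Row that contains the optimal joint action.
optimal_row = argmax([max(row) for row in PAYOFF])

# Target task: the optimal row looks worse than a sub-optimal one (RO).
target_utils = action_utilities(PAYOFF, uniform)
greedy_target = argmax(target_utils)

# Source task with penalties removed: the optimal row looks best again.
source_utils = action_utilities(scale_penalties(PAYOFF, 0.0), uniform)
greedy_source = argmax(source_utils)

print("optimal row:", optimal_row)
print("target utilities:", target_utils, "greedy:", greedy_target)
print("source utilities:", source_utils, "greedy:", greedy_source)
```

Under the uniform partner, the row holding the optimal joint action has the lowest expected reward in the target task, so an agent that greedily follows these utilities moves away from it. In the softened source task the same row has the highest expected reward, which is the intuition behind training on easier tasks first.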