QLLM: Do We Really Need A Mixing Network For Credit Assignment In Multi-agent Reinforcement Learning?
2025 · Yuanjun Li, Zhouyang Jiang, Bin Zhang, et al.
Abstract
Credit assignment remains a fundamental challenge in multi-agent reinforcement learning (MARL) and is commonly addressed through value decomposition under the centralized training with decentralized execution (CTDE) paradigm. However, existing value decomposition methods typically rely on predefined mixing networks that require additional training, often leading to imprecise credit attribution and limited interpretability. We propose QLLM, a novel framework that leverages large language models (LLMs) to construct training-free credit assignment functions (TFCAFs), where the TFCAFs are nonlinear with respect to the global state and offer enhanced interpretability while introducing no extra learnable parameters. A coder-evaluator framework is employed to ensure the correctness and executability of the generated code. Extensive experiments on standard MARL benchmarks demonstrate that QLLM consistently outperforms baselines while requiring fewer learnable parameters. Furthermore, it
Related papers
- Cooperative Multi-agent Transfer Learning With Level-adaptive Credit Assignment (2021)
- Shapley Counterfactual Credits For Multi-agent Reinforcement Learning (2021)
- Credit Assignment With Meta-policy Gradient For Multi-agent Reinforcement Learning (2021)
- Locality Matters: A Scalable Value Decomposition Approach For Cooperative Multi-agent Reinforcement Learning (2021)
- Asynchronous Credit Assignment For Multi-agent Reinforcement Learning (2024)
- Nucleolus Credit Assignment For Effective Coalitions In Multi-agent Reinforcement Learning (2025)
- MACCA: Offline Multi-agent Reinforcement Learning With Causal Credit Assignment (2023)
- MARSHAL: Incentivizing Multi-agent Reasoning Via Self-play With Strategic LLMs (2025)