PolicyEvolve: Evolving Programmatic Policies by LLMs for Multi-player Games via Population-based Training
2025 · Mingrui Lv, Hangzhi Liu, Zhi Luo, et al.
Abstract
Multi-agent reinforcement learning (MARL) has achieved significant progress in solving complex multi-player games through self-play. However, training effective adversarial policies requires millions of experience samples and substantial computational resources. Moreover, these policies lack interpretability, hindering their practical deployment. Recently, researchers have successfully leveraged Large Language Models (LLMs) to generate programmatic policies for single-agent tasks, transforming neural network-based policies into interpretable rule-based code with high execution efficiency. Inspired by this, we propose PolicyEvolve, a general framework for generating programmatic policies in multi-player games. PolicyEvolve significantly reduces reliance on manually crafted policy code, achieving high-performance policies with minimal environmental interactions. The framework comprises four modules: Global Pool, Local Pool, Policy Planner, and Trajectory Critic. The Global Pool preserves
Related papers
- Multimodal LLM-assisted Evolutionary Search For Programmatic Control Policies (2025)
- Agent-Pro: Learning To Evolve Via Policy-level Reflection And Optimization (2024)
- Evolution Of Societies Via Reinforcement Learning (2024)
- Discovering Multiagent Learning Algorithms With Large Language Models (2026)
- Generative Evolutionary Meta-solver (GEMS): Scalable Surrogate-free Multi-agent Reinforcement Learning (2025)
- End-to-end Optimization Of LLM-driven Multi-agent Search Systems Via Heterogeneous-group-based Reinforcement Learning (2025)
- Evolutionary Population Curriculum For Scaling Multi-agent Reinforcement Learning (2020)
- LERO: LLM-driven Evolutionary Framework With Hybrid Rewards And Enhanced Observation For Multi-agent Reinforcement Learning (2025)