Multimodal LLM-assisted Evolutionary Search For Programmatic Control Policies
2025 · Qinglong Hu, Xialiang Tong, Mingxuan Yuan, et al.
Abstract
Deep reinforcement learning has achieved impressive success in control tasks. However, its policies, represented as opaque neural networks, are often difficult for humans to understand, verify, and debug, which undermines trust and hinders real-world deployment. This work addresses this challenge by introducing a novel approach for programmatic control policy discovery, called Multimodal Large Language Model-assisted Evolutionary Search (MLES). MLES utilizes multimodal large language models as programmatic policy generators, combining them with evolutionary search to automate policy generation. It integrates visual feedback-driven behavior analysis within the policy generation process to identify failure patterns and guide targeted improvements, thereby enhancing policy discovery efficiency and producing adaptable, human-aligned policies. Experimental results demonstrate that MLES achieves performance comparable to Proximal Policy Optimization (PPO) across two standard control tasks wh
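The loop described in the abstract (a generator proposes programmatic policies, rollouts score them, and behavior feedback steers the next proposals) can be illustrated with a toy sketch. This is not the authors' implementation: the point-mass task, the `Policy` class, the `rollout` and `evolve` functions, and the feedback fields are all hypothetical, and `propose_policy` is a hand-written heuristic standing in for the multimodal LLM call, which in MLES would receive the policy code together with visual evidence of its behavior.

```python
import random
from dataclasses import dataclass


@dataclass
class Policy:
    """Toy programmatic policy: action = -(kp * position + kd * velocity)."""
    kp: float
    kd: float

    def act(self, pos: float, vel: float) -> float:
        return -(self.kp * pos + self.kd * vel)


def rollout(policy, steps=200, dt=0.05):
    """Simulate a unit point mass starting at pos=1; return (score, feedback)."""
    pos, vel = 1.0, 0.0
    cost = 0.0
    max_overshoot = 0.0
    for _ in range(steps):
        force = max(-2.0, min(2.0, policy.act(pos, vel)))  # clip the action
        vel += force * dt
        pos += vel * dt
        cost += pos * pos * dt
        max_overshoot = max(max_overshoot, -pos)
    # Hypothetical stand-in for visual behavior analysis: a summary of the run.
    feedback = {"overshoot": max_overshoot, "final_error": abs(pos)}
    return -cost, feedback


def propose_policy(parent, feedback, rng):
    """Stub generator (assumption): an MLLM call would go here instead."""
    kp, kd = parent.kp, parent.kd
    if feedback["overshoot"] > 0.05:    # overshooting the target: add damping
        kd += 0.2
    if feedback["final_error"] > 0.05:  # ending far from the target: add gain
        kp += 0.2
    kp = max(0.0, kp + rng.gauss(0.0, 0.1))
    kd = max(0.0, kd + rng.gauss(0.0, 0.1))
    return Policy(kp, kd)


def evolve(generations=30, pop_size=6, seed=0):
    """Elitist evolutionary search: keep the top half, refill with proposals."""
    rng = random.Random(seed)
    population = []
    for _ in range(pop_size):
        candidate = Policy(rng.uniform(0, 1), rng.uniform(0, 1))
        score, fb = rollout(candidate)
        population.append((score, candidate, fb))
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: t[0], reverse=True)
        history.append(population[0][0])
        parents = population[: pop_size // 2]
        children = []
        for _, parent, fb in parents:
            child = propose_policy(parent, fb, rng)
            c_score, c_fb = rollout(child)
            children.append((c_score, child, c_fb))
        population = parents + children
    population.sort(key=lambda t: t[0], reverse=True)
    history.append(population[0][0])
    return population[0], history


if __name__ == "__main__":
    (best_score, best_policy, _), history = evolve()
    print(best_policy, best_score)
```

Because the parents survive each generation, the best score in `history` never decreases. Replacing `propose_policy` with a call to a real model is the only change needed to move this sketch toward the LLM-in-the-loop setting the abstract describes.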
Related papers
- PolicyEvolve: Evolving Programmatic Policies By LLMs For Multi-player Games Via Population-based Training (2025) 0.00
- End-to-end Optimization Of LLM-driven Multi-agent Search Systems Via Heterogeneous-group-based Reinforcement Learning (2025) 0.00
- Human-readable Programs As Actors Of Reinforcement Learning Agents Using Critic-moderated Evolution (2024) 0.00
- Collaborative Evolutionary Reinforcement Learning (2019) 0.00
- LERO: LLM-driven Evolutionary Framework With Hybrid Rewards And Enhanced Observation For Multi-agent Reinforcement Learning (2025) 3.58
- Agent-Pro: Learning To Evolve Via Policy-level Reflection And Optimization (2024) 9.59
- Interpretability By Design For Efficient Multi-objective Reinforcement Learning (2025) 0.00
- CEM-RL: Combining Evolutionary And Gradient-based Methods For Policy Search (2018) 0.00