Survival Dynamics Of Neural And Programmatic Policies In Evolutionary Reinforcement Learning
2026 · Anton Roupassov-Ruiz, Yiyang Zuo
Abstract
In evolutionary reinforcement learning (ERL) tasks, agent policies are often encoded as small artificial neural networks (NERL). Such representations lack explicit modular structure, which limits behavioral interpretation. We investigate whether programmatic policies (PERL), implemented as soft, differentiable decision lists (SDDL), can match the performance of NERL. To support reproducible evaluation, we provide the first fully specified, open-source reimplementation of the classic 1992 Artificial Life (ALife) ERL testbed. We conduct a rigorous survival analysis across 4000 independent trials using Kaplan-Meier curves and Restricted Mean Survival Time (RMST), metrics absent from the original study. We find a statistically significant difference in survival probability between PERL and NERL: PERL agents survive on average 201.69 steps longer than NERL agents. Moreover, SDDL agents using learning alone (no evolution) survive on average 73.67 steps longer than neural agents using both le
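For readers unfamiliar with the two survival metrics named in the abstract, the following is a minimal sketch of a Kaplan-Meier estimator and an RMST computed as the area under the resulting step curve. It is not the authors' analysis code: the function names, the toy lifetimes, and the truncation horizon `tau` are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 observed: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    durations[i] is how long trial i ran; observed[i] is True if the agent
    died at that time and False if the trial was censored (cut off alive).
    """
    at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(set(durations)):
        deaths = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if deaths > 0:
            # Product-limit update: multiply by the fraction surviving time t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored trials both leave the risk set after time t.
        at_risk -= leaving
    return curve


def rmst(curve: List[Tuple[float, float]], tau: float) -> float:
    """Restricted mean survival time: area under the step curve on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    # Final rectangle from the last event time before tau up to tau.
    area += prev_s * (tau - prev_t)
    return area


# Hypothetical toy lifetimes (in steps) for two groups of agents.
group_a = kaplan_meier([2, 4, 4, 6], [True, True, False, True])
group_b = kaplan_meier([1, 2, 3, 6], [True, True, True, False])
tau = 6.0  # assumed horizon; both groups are compared over the same window
rmst_difference = rmst(group_a, tau) - rmst(group_b, tau)
```

A difference in RMST reads directly as "steps survived longer on average within the horizon", which is the form in which the abstract reports its comparisons.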