Simple Recipe Works: Vision-language-action Models Are Natural Continual Learners With Reinforcement Learning
2026 · Jiaheng Hu, Jay Shim, Chen Tang, et al.
Abstract
Continual Reinforcement Learning (CRL) for Vision-Language-Action (VLA) models is a promising direction toward self-improving embodied agents that can adapt in open-ended, evolving environments. However, conventional wisdom from continual learning suggests that naive Sequential Fine-Tuning (Seq. FT) leads to catastrophic forgetting, necessitating complex CRL strategies. In this work, we take a step back and conduct a systematic study of CRL for large pretrained VLAs across three models and five challenging lifelong RL benchmarks. We find that, contrary to established belief, simple Seq. FT with low-rank adaptation (LoRA) is remarkably strong: it achieves high plasticity, exhibits little to no forgetting, and retains strong zero-shot generalization, frequently outperforming more sophisticated CRL methods. Through detailed analysis, we show that this robustness arises from a synergy between the large pretrained model, parameter-efficient adaptation, and on-policy RL. Together, these compo
Related papers
- Enhancing Vision-language Model Training With Reinforcement Learning In Synthetic Worlds For Real-world Success (2025)
- RL Token: Bootstrapping Online RL With Vision-language-action Models (2026)
- Continual Reinforcement Learning By Planning With Online World Models (2025)
- Task-agnostic Continual Reinforcement Learning: Gaining Insights And Overcoming Challenges (2022)
- Discovering Failure Modes In Vision-language Models Using RL (2026)
- Continual Policy Distillation From Distributed Reinforcement Learning Teachers (2026)
- Continual Visual Reinforcement Learning With A Life-long World Model (2023)
- Continual Knowledge Adaptation For Reinforcement Learning (2025)