Healthcare AI GYM For Medical Agents
2026 · Minbyul Jeong
Abstract
arXiv:2605.02943v1. Clinical reasoning demands multi-step interactions -- gathering patient history, ordering tests, interpreting results, and making safe treatment decisions -- yet a unified training environment that provides the breadth of clinical domains and specialized tools needed to train generalizable medical AI agents through reinforcement learning remains elusive. We present a comprehensive empirical study of multi-turn agentic RL for medical AI, built on Healthcare AI GYM, a gymnasium-compatible environment spanning 10 clinical domains with 3.6K+ tasks, 135 domain-specific tools, and a knowledge base of 828K medical passages. Our analysis reveals that agentic multi-turn structure degrades into verbose single-turn monologues, characterized by monotonic length explosion and a simultaneous erosion of tool-use frequency. We characterize how this collapse, alongside distillation instability, stems from the misalignment of sparse terminal rewards with sequential clinical
Related papers
- Adaptive Multi-agent Deep Reinforcement Learning For Timely Healthcare Interventions (2023)
- The AI Arena: A Framework For Distributed Multi-agent Reinforcement Learning (2021)
- UserRL: Training Interactive User-centric Agent Via Reinforcement Learning (2025)
- Cure-med: Curriculum-informed Reinforcement Learning For Multilingual Medical Reasoning (2026)
- An Empirical Study Of Representation Learning For Reinforcement Learning In Healthcare (2020)
- ComputerRL: Scaling End-to-end Online Reinforcement Learning For Computer Use Agents (2025)
- GenAI-based Multi-agent Reinforcement Learning Towards Distributed Agent Intelligence: A Generative-RL Agent Perspective (2025)
- Efficient Multi-turn RL For GUI Agents Via Decoupled Training And Adaptive Data Curation (2025)