Reinforcement Learning Enhanced Online Adaptive Clinical Decision Support Via Digital Twin Powered Policy And Treatment Effect Optimized Reward
2025 · Xinyu Qin, Ruiheng Yu, Lu Wang
Abstract
Clinical decision support must adapt online under safety constraints. We present an online adaptive tool where reinforcement learning provides the policy, a patient digital twin provides the environment, and treatment effect defines the reward. The system initializes a batch-constrained policy from retrospective data and then runs a streaming loop that selects actions, checks safety, and queries experts only when uncertainty is high. Uncertainty comes from a compact ensemble of five Q-networks via the coefficient of variation of action values with a \(\tanh\) compression. The digital twin updates the patient state with a bounded residual rule. The outcome model estimates immediate clinical effect, and the reward is the treatment effect relative to a conservative reference with a fixed z-score normalization from the training split. Online updates operate on recent data with short runs and exponential moving averages. A rule-based safety gate enforces vital ranges and contraindications b
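The uncertainty signal described in the abstract can be illustrated with a minimal sketch. It assumes the coefficient of variation is taken across the five ensemble members for each action (standard deviation divided by the absolute mean) and then passed through \(\tanh\) so the score lies in [0, 1). The function names, the `eps` stabilizer, and the `threshold` of 0.5 for querying an expert are hypothetical and do not come from the paper.

```python
import numpy as np


def ensemble_uncertainty(q_values: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-action uncertainty from an ensemble of Q-value estimates.

    q_values has shape (n_members, n_actions), e.g. (5, n_actions) for
    five Q-networks. The result has shape (n_actions,) and lies in [0, 1).
    """
    mean = q_values.mean(axis=0)
    std = q_values.std(axis=0)
    # Coefficient of variation; eps (an assumption) avoids division by zero.
    cv = std / (np.abs(mean) + eps)
    # tanh compresses the unbounded ratio into [0, 1).
    return np.tanh(cv)


def select_action(q_values: np.ndarray, threshold: float = 0.5):
    """Pick the greedy action on the ensemble mean and flag expert queries.

    threshold is a hypothetical cut-off: above it, the caller would ask
    an expert instead of acting autonomously.
    """
    mean = q_values.mean(axis=0)
    action = int(np.argmax(mean))
    uncertainty = float(ensemble_uncertainty(q_values)[action])
    return action, uncertainty, uncertainty > threshold


# Five members that agree exactly: zero uncertainty, no expert query.
agree = np.tile(np.array([1.0, 2.0, 0.5]), (5, 1))

# Five members that disagree strongly on action 0.
disagree = np.full((5, 3), 0.1)
disagree[:, 0] = [1.0, -1.0, 2.0, -2.0, 5.0]
```

In this sketch only the greedy action's score drives the query decision; a system could equally aggregate uncertainty over all actions, and the abstract does not say which is used.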