Robust Reinforcement Learning Under Diffusion Models For Data With Jumps
2024 · Chenyang Jiang, Donggyu Kim, Alejandra Quintos, et al.
Abstract
Reinforcement Learning (RL) has proven effective in solving complex decision-making tasks across various domains, but challenges remain in continuous-time settings, particularly when state dynamics are governed by stochastic differential equations (SDEs) with jump components. In this paper, we address this challenge by introducing the Mean-Square Bipower Variation Error (MSBVE) algorithm, which enhances robustness and convergence in scenarios involving significant stochastic noise and jumps. We first revisit the Mean-Square TD Error (MSTDE) algorithm, commonly used in continuous-time RL, and highlight its limitations in handling jumps in state dynamics. The proposed MSBVE algorithm minimizes the mean-square quadratic variation error, offering improved performance over MSTDE in environments characterized by SDEs with jumps. Simulations and formal proofs demonstrate that the MSBVE algorithm reliably estimates the value function in complex settings, surpassing MSTDE's performance when fac
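The "bipower variation" in the algorithm's name can be illustrated with a small simulation. The sketch below is not the paper's MSBVE algorithm. It only compares two standard estimators of integrated variance on a simulated jump-diffusion path: realized (quadratic) variation, which absorbs the squared jumps, and realized bipower variation, which is largely insensitive to rare jumps. The function names, the compound-Poisson jump model with Gaussian jump sizes, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def simulate_jump_diffusion(n_steps, dt, sigma, jump_rate, jump_scale, rng):
    """Increments of dX = sigma dW + dJ (hypothetical toy model).

    J is a compound Poisson process with intensity `jump_rate` and
    N(0, jump_scale^2) jump sizes; this choice is an assumption.
    """
    diffusion = sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    n_jumps = rng.poisson(jump_rate * dt, size=n_steps)
    # The sum of k independent N(0, s^2) jumps is N(0, k * s^2).
    jumps = jump_scale * np.sqrt(n_jumps) * rng.standard_normal(n_steps)
    return diffusion + jumps


def realized_variance(increments):
    """Sum of squared increments: integrated variance plus squared jumps."""
    return float(np.sum(increments ** 2))


def bipower_variation(increments):
    """(pi/2) * sum of |dX_i| * |dX_{i-1}|: robust to rare, finite jumps."""
    a = np.abs(increments)
    return float((np.pi / 2.0) * np.sum(a[1:] * a[:-1]))


rng = np.random.default_rng(0)
T, n_steps, sigma = 1.0, 100_000, 0.5
dt = T / n_steps

inc = simulate_jump_diffusion(n_steps, dt, sigma,
                              jump_rate=5.0, jump_scale=1.0, rng=rng)

integrated_variance = sigma ** 2 * T  # target quantity without jumps
rv = realized_variance(inc)
bv = bipower_variation(inc)

print(f"integrated variance: {integrated_variance:.4f}")
print(f"realized variance:   {rv:.4f}")
print(f"bipower variation:   {bv:.4f}")
```

When the path contains jumps, the realized variance overshoots the integrated variance by roughly the sum of squared jump sizes, while the bipower variation stays close to it. This is the general motivation for replacing a quadratic-variation-type quantity with a bipower one when the state dynamics include jumps.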
Related papers
- Continuous-time Risk-sensitive Reinforcement Learning Via Quadratic Variation Penalty (2024)
- Understanding Sampler Stochasticity In Training Diffusion Models For RLHF (2025)
- Distributionally Robust Model-based Reinforcement Learning With Large State Spaces (2023)
- A Random Measure Approach To Reinforcement Learning In Continuous Time (2024)
- Non-stationary Reinforcement Learning: The Blessing Of (more) Optimism (2019)
- Efficient And Robust Reinforcement Learning With Uncertainty-based Value Expansion (2019)
- Robust Bayesian Dynamic Programming For On-policy Risk-sensitive Reinforcement Learning (2025)
- Reinforcement Learning With Non-ergodic Reward Increments: Robustness Via Ergodicity Transformations (2023)