Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs
2020 · Jianzhun Du, Joseph Futoma, Finale Doshi-Velez
Abstract
We present two elegant solutions for modeling continuous-time dynamics in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs). Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data. We also develop a model-based approach for optimizing time schedules to reduce interaction rates with the environment while maintaining near-optimal performance, which is not possible for model-free methods. We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
Related papers
- Efficient Exploration in Continuous-time Model-based Reinforcement Learning (2023)
- DyNODE: Neural Ordinary Differential Equations for Dynamics Modeling in Continuous Control (2020)
- Double Reinforcement Learning for Efficient Off-policy Evaluation in Markov Decision Processes (2019)
- A General Markov Decision Process Framework for Directly Learning Optimal Control Policies (2019)
- Policy Optimization for Continuous Reinforcement Learning (2023)
- On Learning History Based Policies for Controlling Markov Decision Processes (2022)
- Optimal Decision-making in Mixed-agent Partially Observable Stochastic Environments via Reinforcement Learning (2019)
- Sample Efficient Reinforcement Learning in Continuous State Spaces: A Perspective Beyond Linearity (2021)