Bayesian Inverse Transition Learning: Learning Dynamics From Near-optimal Trajectories
2026 · Leo Benac, Abhishek Sharma, Sonali Parbhoo, et al.
Abstract
arXiv:2411.05174v2
We consider the problem of estimating the transition dynamics \(T^*\) from near-optimal expert trajectories in the context of offline model-based reinforcement learning. We develop a novel constraint-based method, Inverse Transition Learning, that treats the limited coverage of the expert trajectories as a *feature*: we use the fact that the expert is near-optimal to inform our estimate of \(T^*\). We integrate our constraints into a Bayesian approach. Across both synthetic environments and real healthcare scenarios, such as Intensive Care Unit (ICU) patient management in hypotension, we demonstrate not only significant improvements in decision-making, but also that our posterior can inform when transfer will be successful.
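To make the idea of using near-optimality as a constraint on \(T^*\) concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a tabular MDP with a known reward `R`, a symmetric Dirichlet prior, and plain rejection sampling, and every name in it (`q_values`, `sample_constrained_posterior`, `eps`, `alpha`) is hypothetical. A posterior draw of the dynamics is kept only if each observed expert action is within `eps` of the best action under that draw.

```python
import numpy as np

def q_values(T, R, gamma=0.9, iters=200):
    """Value iteration. T has shape (S, A, S); R has shape (S, A)."""
    Q = np.zeros_like(R, dtype=float)
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = R + gamma * (T @ V)  # (S, A, S) @ (S,) -> (S, A)
    return Q

def sample_constrained_posterior(counts, R, expert_actions, eps=0.05,
                                 alpha=1.0, n_samples=20, max_tries=5000,
                                 gamma=0.9, seed=0):
    """Rejection-sample transition models consistent with a near-optimal expert.

    Assumption (illustrative only): each (s, a) row has a Dirichlet posterior
    Dirichlet(alpha + counts[s, a]); a draw is accepted when every expert
    action a_E in state s satisfies Q[s, a_E] >= max_a Q[s, a] - eps.
    """
    rng = np.random.default_rng(seed)
    S, A, _ = counts.shape
    accepted = []
    for _ in range(max_tries):
        T = np.array([[rng.dirichlet(alpha + counts[s, a]) for a in range(A)]
                      for s in range(S)])
        Q = q_values(T, R, gamma)
        if all(Q[s, a] >= Q[s].max() - eps for s, a in expert_actions.items()):
            accepted.append(T)
            if len(accepted) == n_samples:
                break
    return accepted

# Toy problem: state 1 is rewarding; the expert only ever tried one action per state.
R = np.array([[0.0, 0.0], [1.0, 1.0]])
counts = np.zeros((2, 2, 2))
counts[0, 1, 1] = 3          # in state 0, expert action 1 led to state 1 three times
counts[1, 0, 1] = 3          # in state 1, expert action 0 stayed in state 1 three times
expert_actions = {0: 1, 1: 0}

samples = sample_constrained_posterior(counts, R, expert_actions)
# Posterior mean of the never-observed transition T[s=0, a=0, s'=1]
print(np.mean([T[0, 0, 1] for T in samples]))
```

In this toy problem the action the expert never took in state 0 has no data at all, yet the accepted draws still carry information about it: because the expert preferred the other action, draws in which the untried action reaches the rewarding state much more reliably are rejected. This is the sense in which limited coverage can act as a feature rather than a limitation.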
Related papers
- Inverse Reinforcement Learning From Non-stationary Learning Agents (2024)
- In-trajectory Inverse Reinforcement Learning: Learn Incrementally Before An Ongoing Trajectory Terminates (2024)
- Model-based Reinforcement Learning For Control Under Time-varying Dynamics (2026)
- Learning Multimodal Transition Dynamics For Model-based Reinforcement Learning (2017)
- Reinforcement Learning With Trajectory Feedback (2020)
- A Bayesian Approach To Robust Inverse Reinforcement Learning (2023)
- Adaptive Probabilistic Trajectory Optimization Via Efficient Approximate Inference (2016)
- Bitrajdiff: Bidirectional Trajectory Generation With Diffusion Models For Offline Reinforcement Learning (2025)