Abstract

Without exact knowledge of the true system dynamics, optimal control of non-linear continuous-time systems requires careful treatment under epistemic uncertainty. In this work, we translate a probabilistic interpretation of the Pontryagin maximum principle to the challenge of optimal control with learned probabilistic dynamics models. Our framework provides a principled treatment of epistemic uncertainty by minimizing the mean Hamiltonian with respect to a posterior distribution over the system dynamics. We propose a multiple shooting numerical method that leverages mean Hamiltonian minimization and is scalable to large-scale probabilistic dynamics models, including ensemble neural ordinary differential equations. Comparisons against other baselines in online and offline model-based reinforcement learning tasks show that our probabilistic Hamiltonian approach leads to reduced trial costs in offline settings and achieves competitive performance in online scenarios. By bridging optimal c

Authors

(none)

Tags

  • Model-Based RL

Stats

Related papers