Adaptive Probabilistic Trajectory Optimization Via Efficient Approximate Inference
2016 · Yunpeng Pan, Xinyan Yan, Evangelos Theodorou, et al.
Abstract
Robotic systems must be able to quickly and robustly make decisions when operating in uncertain and dynamic environments. While Reinforcement Learning (RL) can be used to compute optimal policies with little prior knowledge about the environment, it suffers from slow convergence. An alternative approach is Model Predictive Control (MPC), which optimizes policies quickly but requires accurate models of the system dynamics and environment. In this paper we propose a new approach, adaptive probabilistic trajectory optimization, that combines the benefits of RL and MPC. Our method uses scalable approximate inference to learn and update probabilistic models in an online, incremental fashion while also computing optimal control policies via successive local approximations. We present two variations of our algorithm based on the Sparse Spectrum Gaussian Process (SSGP) model, and we test our algorithm on three learning tasks, demonstrating the effectiveness and efficiency of our approach.
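To make the idea of an incrementally updated SSGP model concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: it assumes the standard SSGP construction (random Fourier features for an RBF kernel followed by Bayesian linear regression), and the class name `SparseSpectrumGP`, its methods, and all hyperparameter values are hypothetical choices for illustration. The trajectory optimization part of the method is not covered.

```python
import numpy as np


class SparseSpectrumGP:
    """Illustrative SSGP regressor: random Fourier features + Bayesian linear regression.

    Hypothetical sketch; names and default hyperparameters are assumptions.
    """

    def __init__(self, input_dim, num_features=50, lengthscale=1.0,
                 signal_var=1.0, noise_var=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Frequencies drawn from the spectral density of an RBF kernel.
        self.omega = rng.normal(scale=1.0 / lengthscale,
                                size=(num_features, input_dim))
        self.scale = np.sqrt(signal_var / num_features)
        self.noise_var = noise_var
        d = 2 * num_features
        # Sufficient statistics: A = Phi^T Phi + noise_var * I, b = Phi^T y.
        self.A = noise_var * np.eye(d)
        self.b = np.zeros(d)

    def features(self, X):
        # X has shape (n, input_dim); returns features of shape (n, 2 * num_features).
        proj = np.atleast_2d(X) @ self.omega.T
        return self.scale * np.hstack([np.cos(proj), np.sin(proj)])

    def update(self, X, y):
        # Incremental update: new data only adds to the accumulated statistics.
        Phi = self.features(X)
        self.A += Phi.T @ Phi
        self.b += Phi.T @ np.atleast_1d(y)

    def predict(self, X):
        # Predictive mean and variance (variance includes observation noise).
        Phi = self.features(X)
        mean = Phi @ np.linalg.solve(self.A, self.b)
        quad = np.sum(Phi * np.linalg.solve(self.A, Phi.T).T, axis=1)
        var = self.noise_var * (1.0 + quad)
        return mean, var
```

Because the model depends on the data only through `A` and `b`, feeding observations in small batches through `update` gives the same predictions as a single batch fit, up to floating-point rounding. The cost of each update depends on the number of features rather than on the number of data points seen so far.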
Related papers
- Towards An Adaptable And Generalizable Optimization Engine In Decision And Control: A Meta Reinforcement Learning Approach (2024) 0.00
- Blending MPC & Value Function Approximation For Efficient Reinforcement Learning (2020) 0.00
- Improved Exploration Through Latent Trajectory Optimization In Deep Deterministic Policy Gradient (2019) 0.00
- Adaptive Dynamic Programming For Model-free Tracking Of Trajectories With Time-varying Parameters (2019) 9.59
- Model Predictive Control And Reinforcement Learning: A Unified Framework Based On Dynamic Programming (2024) 10.61
- Uncertainty-aware Policy Optimization: A Robust, Adaptive Trust Region Approach (2020) 0.00
- Deep Gaussian Covariance Network With Trajectory Sampling For Data-efficient Policy Search (2024) 0.00
- Reinforcement Learning For Robotics And Control With Active Uncertainty Reduction (2019) 0.00