Deep Gaussian Covariance Network With Trajectory Sampling For Data-efficient Policy Search
2024 · Can Bogoclu, Robert Vosshall, Kevin Cremanns, et al.
Abstract
Probabilistic world models increase data efficiency of model-based reinforcement learning (MBRL) by guiding the policy with their epistemic uncertainty to improve exploration and acquire new samples. Moreover, the uncertainty-aware learning procedures in probabilistic approaches lead to robust policies that are less sensitive to noisy observations compared to uncertainty-unaware solutions. We propose to combine trajectory sampling with a deep Gaussian covariance network (DGCN) for a data-efficient solution to MBRL problems in an optimal control setting. We compare trajectory sampling with density-based approximation for uncertainty propagation using three different probabilistic world models: Gaussian processes, Bayesian neural networks, and DGCNs. We provide empirical evidence, using four well-known test environments, that our method improves sample efficiency over other combinations of uncertainty propagation methods and probabilistic models. During our tests, we place part
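The two uncertainty propagation strategies compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-step model below is a hypothetical linear-Gaussian stand-in for a GP, Bayesian neural network, or DGCN, and the names `model`, `policy`, `trajectory_sampling`, and `gaussian_propagation` are assumptions made for illustration. Trajectory sampling pushes individual particles through the model's predictive distribution, while the density-based variant propagates only a mean and a variance. The closed-form update is exact here only because the toy dynamics are linear; for nonlinear models, density-based methods generally need an approximation such as moment matching.

```python
import random
import statistics

# Hypothetical toy dynamics and policy, chosen only for illustration.
GAIN_S, GAIN_A, NOISE_STD = 0.9, 0.1, 0.05
POLICY_GAIN = -0.5


def model(state, action):
    """Stand-in probabilistic world model: predictive mean and std of the next state."""
    return GAIN_S * state + GAIN_A * action, NOISE_STD


def policy(state):
    """Stand-in linear policy."""
    return POLICY_GAIN * state


def trajectory_sampling(s0_mean, s0_std, horizon, n_particles, rng):
    """Propagate uncertainty by sampling each particle's next state from the model."""
    particles = [rng.gauss(s0_mean, s0_std) for _ in range(n_particles)]
    for _ in range(horizon):
        next_particles = []
        for s in particles:
            mean, std = model(s, policy(s))
            next_particles.append(rng.gauss(mean, std))
        particles = next_particles
    return particles


def gaussian_propagation(s0_mean, s0_std, horizon):
    """Density-based propagation: track only the mean and variance of the state.

    Exact for this linear-Gaussian toy; nonlinear models need an approximation.
    """
    closed_loop = GAIN_S + GAIN_A * POLICY_GAIN
    mean, var = s0_mean, s0_std ** 2
    for _ in range(horizon):
        mean = closed_loop * mean
        var = closed_loop ** 2 * var + NOISE_STD ** 2
    return mean, var


if __name__ == "__main__":
    rng = random.Random(0)
    particles = trajectory_sampling(1.0, 0.1, horizon=10, n_particles=20000, rng=rng)
    mean, var = gaussian_propagation(1.0, 0.1, horizon=10)
    print("sampled :", statistics.fmean(particles), statistics.pvariance(particles))
    print("gaussian:", mean, var)
```

On this toy problem the particle statistics should agree with the closed-form moments up to sampling error. The two approaches diverge once the model is nonlinear or its predictive noise depends on the state, which is the setting the abstract's comparison targets.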
Related papers
- Deep Reinforcement Learning In A Handful Of Trials Using Probabilistic Dynamics Models (2018)
- Adaptive Probabilistic Trajectory Optimization Via Efficient Approximate Inference (2016)
- Improved Exploration Through Latent Trajectory Optimization In Deep Deterministic Policy Gradient (2019)
- Distributionally Robust Model-based Reinforcement Learning With Large State Spaces (2023)
- Efficient Model-based Reinforcement Learning Through Optimistic Policy Search And Planning (2020)
- Autoregressive Policies For Continuous Control Deep Reinforcement Learning (2019)
- Variance Reduction Based Partial Trajectory Reuse To Accelerate Policy Gradient Optimization (2022)
- World Models Via Policy-guided Trajectory Diffusion (2023)