Reinforcement Learning With Partial Parametric Model Knowledge
2023 · Shuyuan Wang, Philip D. Loewen, Nathan P. Lawrence, et al.
Abstract
We adapt reinforcement learning (RL) methods for continuous control to bridge the gap between complete ignorance and perfect knowledge of the environment. Our method, Partial Knowledge Least Squares Policy Iteration (PLSPI), takes inspiration from both model-free RL and model-based control. It uses incomplete information from a partial model and retains RL's data-driven adaptation towards optimal performance. The linear quadratic regulator provides a case study; numerical experiments demonstrate the effectiveness and resulting benefits of the proposed method.
Related papers
- PC-MLP: Model-based Reinforcement Learning With Policy Cover Guided Exploration (2021)
- Sublinear Regret For A Class Of Continuous-time Linear-quadratic Reinforcement Learning Problems (2024)
- Least-squares Temporal Difference Learning For The Linear Quadratic Regulator (2017)
- Concurrent Learning Of Policy And Unknown Safety Constraints In Reinforcement Learning (2024)
- PID-inspired Inductive Biases For Deep Reinforcement Learning In Partially Observable Control Tasks (2023)
- Deep RL With Information Constrained Policies: Generalization In Continuous Control (2020)
- Learning Interpretable Policies In Hindsight-observable POMDPs Through Partially Supervised Reinforcement Learning (2024)
- The Gap Between Model-based And Model-free Methods On The Linear Quadratic Regulator: An Asymptotic Viewpoint (2018)