PID-inspired Inductive Biases For Deep Reinforcement Learning In Partially Observable Control Tasks
2023 · Ian Char, Jeff Schneider
Abstract
Deep reinforcement learning (RL) has shown immense potential for learning to control systems through data alone. However, one challenge deep RL faces is that the full state of the system is often not observable. When this is the case, the policy needs to leverage the history of observations to infer the current state. At the same time, differences between the training and testing environments make it critical for the policy not to overfit to the sequence of observations it sees at training time. As such, there is an important balancing act between having the history encoder be flexible enough to extract relevant information, yet be robust to changes in the environment. To strike this balance, we look to the PID controller for inspiration. We assert that the PID controller's success shows that only summing and differencing are needed to accumulate information over time for many control tasks. Following this principle, we propose two architectures for encoding history: one that directly uses
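The summing-and-differencing idea behind a PID controller can be illustrated with a minimal sketch. This is not the authors' architecture; it is a textbook discrete-time PID computation, and the names `pid_features`, `pid_action`, `dt`, `kp`, `ki`, and `kd` are assumptions introduced here for illustration only.

```python
from typing import List, Sequence, Tuple


def pid_features(
    errors: Sequence[float], dt: float = 1.0
) -> List[Tuple[float, float, float]]:
    """Return (P, I, D) terms for each step of a tracking-error history.

    P is the current error, I is the running sum of errors scaled by dt,
    and D is the difference between consecutive errors divided by dt.
    The first step has no predecessor, so its D term is set to 0.0.
    """
    features = []
    integral = 0.0
    previous = None
    for error in errors:
        integral += error * dt  # summing accumulates the history
        if previous is None:
            derivative = 0.0
        else:
            derivative = (error - previous) / dt  # differencing tracks change
        features.append((error, integral, derivative))
        previous = error
    return features


def pid_action(
    feature: Tuple[float, float, float], kp: float, ki: float, kd: float
) -> float:
    """Combine the three terms with fixed gains, as a classic PID law does."""
    p, i, d = feature
    return kp * p + ki * i + kd * d


if __name__ == "__main__":
    history = pid_features([1.0, 0.5, 0.25])
    for step, feature in enumerate(history):
        print(step, feature, pid_action(feature, kp=2.0, ki=1.0, kd=1.0))
```

In this sketch the whole observation history is compressed into two running quantities, a sum and a difference, which is the property of the PID controller that the abstract points to.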
Related papers
- Learning Interpretable Policies In Hindsight-observable POMDPs Through Partially Supervised Reinforcement Learning (2024) 2.26
- Active Inference And Reinforcement Learning: A Unified Inference On Continuous State And Action Spaces Under Partial Observability (2022) 5.84
- Reinforcement Learning Under Partial Observability Guided By Learned Environment Models (2022) 6.34
- Inverse Rational Control With Partially Observable Continuous Nonlinear Dynamics (2019) 0.00
- Addressing Action Oscillations Through Learning Policy Inertia (2021) 7.81
- Reinforcement Learning With Partial Parametric Model Knowledge (2023) 0.00
- Task-guided Inverse Reinforcement Learning Under Partial Information (2021) 0.00
- Variational Recurrent Models For Solving Partially Observable Control Tasks (2019) 0.00