Inverse Rational Control With Partially Observable Continuous Nonlinear Dynamics
2019 · Saurabh Daptardar, Paul Schrater, Xaq Pitkow
Abstract
Continuous control and planning remains a major challenge in robotics and machine learning. Neuroscience offers the possibility of learning from animal brains that implement highly successful controllers, but it is unclear how to relate an animal's behavior to control principles. Animals may not always act optimally from the perspective of an external observer, but may still act rationally: we hypothesize that animals choose actions with highest expected future subjective value according to their own internal model of the world. Their actions thus result from solving a different optimal control problem from those on which they are evaluated in neuroscience experiments. With this assumption, we propose a novel framework of model-based inverse rational control that learns the agent's internal model that best explains their actions in a task described as a partially observable Markov decision process (POMDP). In this approach we first learn optimal policies generalized over the entire mod
Related papers
- Probabilistic Inverse Optimal Control For Non-linear Partially Observable Systems Disentangles Perceptual Uncertainty And Behavioral Costs (2023)
- RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm For Continuous Control Of Nonlinear Dynamical Systems (2019)
- Inverse Rational Control: Inferring What You Think From How You Forage (2018)
- Inverse Optimal Control Adapted To The Noise Characteristics Of The Human Sensorimotor System (2021)
- Active Inference And Reinforcement Learning: A Unified Inference On Continuous State And Action Spaces Under Partial Observability (2022)
- PID-inspired Inductive Biases For Deep Reinforcement Learning In Partially Observable Control Tasks (2023)
- A Goal-based Movement Model For Continuous Multi-agent Tasks (2017)
- Model-based Reinforcement Learning For Control Under Time-varying Dynamics (2026)