Unsupervised Real-time Control Through Variational Empowerment
2017 · Maximilian Karl, Maximilian Soelch, Philip Becker-Ehmck, et al.
Abstract
We introduce a methodology for efficiently computing a lower bound to empowerment, allowing it to be used as an unsupervised cost function for policy learning in real-time control. Empowerment, being the channel capacity between actions and states, maximises the influence of an agent on its near future. It has been shown to be a good model of biological behaviour in the absence of an extrinsic goal. But empowerment is also prohibitively hard to compute, especially in nonlinear continuous spaces. We introduce an efficient, amortised method for learning empowerment-maximising policies. We demonstrate that our algorithm can reliably handle continuous dynamical systems using system dynamics learned from raw data. The resulting policies consistently drive the agents into states where they can use their full potential.
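To make "channel capacity between actions and states" concrete, the following is a minimal sketch for a small discrete system, using the classical Blahut-Arimoto iteration. It is not the amortised variational method of the paper, which targets continuous systems where this exact computation is intractable. The function name `channel_capacity` and the toy transition tables are assumptions made purely for illustration.

```python
import math


def channel_capacity(p_next_given_action, iters=200):
    """Blahut-Arimoto estimate of max_w I(A; S') in bits.

    p_next_given_action[a][s] is p(s' = s | a) for one fixed current state.
    The maximising distribution w over actions is the "source" distribution.
    """
    n_actions = len(p_next_given_action)
    n_states = len(p_next_given_action[0])
    w = [1.0 / n_actions] * n_actions  # start from a uniform action distribution

    def kl_per_action(w):
        # Marginal over next states induced by the action distribution w.
        marg = [
            sum(w[a] * p_next_given_action[a][s] for a in range(n_actions))
            for s in range(n_states)
        ]
        # KL(p(. | a) || marginal) in nats, one value per action.
        return [
            sum(
                p * math.log(p / marg[s])
                for s, p in enumerate(p_next_given_action[a])
                if p > 0.0
            )
            for a in range(n_actions)
        ]

    for _ in range(iters):
        d = kl_per_action(w)
        unnorm = [w[a] * math.exp(d[a]) for a in range(n_actions)]
        z = sum(unnorm)
        w = [u / z for u in unnorm]

    d = kl_per_action(w)
    # Mutual information under the final w, converted from nats to bits.
    return sum(w[a] * d[a] for a in range(n_actions)) / math.log(2.0)


# Toy state where each of four actions reaches a distinct next state.
free_state = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Toy state where every action leads to the same next state.
stuck_state = [[1.0, 0.0, 0.0, 0.0] for _ in range(4)]

empowerment_free = channel_capacity(free_state)
empowerment_stuck = channel_capacity(stuck_state)
```

In this toy setting the state with four distinguishable outcomes has an empowerment of about 2 bits, while the state in which the agent's actions make no difference has none. An empowerment-maximising policy would therefore prefer to reach the first kind of state.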
Related papers
- A Unified Bellman Optimality Principle Combining Reward Maximization And Empowerment (2019) 0.00
- Variational Empowerment As Representation Learning For Goal-based Reinforcement Learning (2021) 0.00
- Extremum-seeking Action Selection For Accelerating Policy Optimization (2024) 0.00
- Optimal Exploration For Model-based RL In Nonlinear Systems (2023) 0.00
- Model-based Reinforcement Learning For Control Under Time-varying Dynamics (2026) 0.00
- Sample Complexity Of Estimating The Policy Gradient For Nearly Deterministic Dynamical Systems (2019) 0.00
- An Efficient, Expressive And Local Minima-free Method For Learning Controlled Dynamical Systems (2017) 2.26
- An Optimal Policy For Learning Controllable Dynamics By Exploration (2025) 0.00