Unsupervised Predictive Memory In A Goal-directed Agent
2018 · Greg Wayne, Chia-Chun Hung, David Amos, et al.
Abstract
Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement learning (RL) algorithms with deep neural networks, and the excitement surrounding these results has led to the pursuit of related ideas as explanations of non-human animal learning. However, we demonstrate that contemporary RL algorithms struggle to solve simple tasks when enough information is concealed from the sensors of the agent, a property called "partial observability". An obvious requirement for handling partially observed tasks is access to extensive memory, but we show memory is not enough; it is critical that the right information be stored in the right format. We develop a model, t
Related papers
- Learning, Fast And Slow: A Goal-directed Memory-based Approach For Dynamic Environments (2023) 0.00
- Stable Hadamard Memory: Revitalizing Memory-augmented Agents For Reinforcement Learning (2024) 0.00
- The Act Of Remembering: A Study In Partially Observable Reinforcement Learning (2020) 0.00
- Ego-foresight: Self-supervised Learning Of Agent-aware Representations For Improved RL (2024) 0.00
- Self-adapting Goals Allow Transfer Of Predictive Models To New Tasks (2019) 2.26
- Towards Mental Time Travel: A Hierarchical Memory For Reinforcement Learning Agents (2021) 0.00
- Dynamic Memory For Interpretable Sequential Optimisation (2022) 0.00
- Curious Exploration And Return-based Memory Restoration For Deep Reinforcement Learning (2021) 0.00