Beyond Sliding Windows: Learning To Manage Memory In Non-markovian Environments
2025 · Geraud Nangue Tasse, Matthew Riemer, Benjamin Rosman, et al.
Abstract
Recent success in developing increasingly general-purpose agents based on sequence models has led to increased focus on the problem of deploying computationally limited agents within the vastly more complex real world. A key challenge experienced in these more realistic domains is highly non-Markovian dependencies with respect to the agent's observations, which are less common in small controlled domains. The predominant approach for dealing with this in the literature is to stack together a window of the most recent observations (Frame Stacking), but this window size must grow with the degree of non-Markovian dependencies, which results in prohibitive computational and memory requirements for both action inference and learning. In this paper, we are motivated by the insight that in many environments that are highly non-Markovian with respect to time, the environment only causally depends on a relatively small number of observations over that time-scale. A natural direction would then
Related papers
- Learning, Fast And Slow: A Goal-directed Memory-based Approach For Dynamic Environments (2023)
- Unsupervised Predictive Memory In A Goal-directed Agent (2018)
- Episodic Memory For Learning Subjective-timescale Models (2020)
- Dynamic Memory For Interpretable Sequential Optimisation (2022)
- Stable Hadamard Memory: Revitalizing Memory-augmented Agents For Reinforcement Learning (2024)
- A Survey Of Learning In Multiagent Environments: Dealing With Non-stationarity (2017)
- Uncertainty Maximization In Partially Observable Domains: A Cognitive Perspective (2021)
- Partial Models For Building Adaptive Model-based Reinforcement Learning Agents (2024)