Expected Eligibility Traces
2020 · Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, et al.
Abstract
The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence. Eligibility traces enable efficient credit assignment to the recent sequence of states and actions experienced by the agent, but not to counterfactual sequences that could also have led to the current state. In this work, we introduce expected eligibility traces. Expected traces allow, with a single update, to update states and actions that could have preceded the current state, even if they did not do so on this occasion. We discuss when expected traces provide benefits over classic (instantaneous) traces in temporal-difference learning, and show that sometimes substantial improvements can be attained. We provide a way to smoothly interpolate between instantaneous and expected traces by a mechanism similar to bootstrapping, which ensures that the resulting
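The idea in the abstract can be illustrated with a minimal tabular sketch. It assumes the usual accumulating trace e_t = γλ e_{t-1} + 1[S_t], a learned table `z[s]` that regresses toward the instantaneous trace observed whenever state `s` is visited, and a TD update that uses `z[s]` in place of the instantaneous trace. The function name, the step sizes `alpha` and `beta`, and the episode format are illustrative assumptions, not the authors' reference implementation, and the interpolation mechanism mentioned in the abstract is not covered.

```python
import numpy as np

def expected_trace_td(episodes, n_states, gamma=0.99, lam=0.9, alpha=0.1, beta=0.1):
    """Tabular TD(lambda)-style evaluation using learned expected traces.

    episodes: iterable of episodes, each a list of
              (state, reward, next_state, done) tuples (assumed format).
    Returns the value table v and the expected-trace table z.
    """
    v = np.zeros(n_states)
    # z[s] approximates the expected trace E[e_t | S_t = s].
    z = np.zeros((n_states, n_states))
    for episode in episodes:
        e = np.zeros(n_states)  # instantaneous trace, reset per episode
        for s, r, s_next, done in episode:
            # Classic accumulating trace for the current trajectory.
            e = gamma * lam * e
            e[s] += 1.0
            # Regress the expected trace for s toward the observed trace.
            z[s] += beta * (e - z[s])
            # One-step TD error.
            target = r + (0.0 if done else gamma * v[s_next])
            delta = target - v[s]
            # Credit every state that tends to precede s, visited or not.
            v += alpha * delta * z[s]
    return v, z

# Two paths (0 -> 2 and 1 -> 2) merge in state 2; only the second is rewarded.
episodes = [
    [(0, 0.0, 2, False), (2, 0.0, 2, True)],
    [(1, 0.0, 2, False), (2, 1.0, 2, True)],
]
v, z = expected_trace_td(episodes, n_states=3)
```

In this toy run, state 0 is never visited in the rewarded episode, yet its value estimate still moves, because `z[2]` remembers that state 0 has preceded state 2 before. A classic instantaneous trace would have updated only states 1 and 2.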
Related papers
- Predecessor Features (2022)
- Trajectory-aware Eligibility Traces For Off-policy Reinforcement Learning (2023)
- Meta-learning State-based Eligibility Traces For More Sample-efficient Policy Evaluation (2019)
- Meta-learning Eligibility Traces For More Sample Efficient Temporal Difference Learning (2020)
- Recall Traces: Backtracking Models For Efficient Reinforcement Learning (2018)
- TBQ(\(\sigma\)): Improving Efficiency Of Trace Utilization For Off-policy Reinforcement Learning (2019)
- Partially Observable Reinforcement Learning With Memory Traces (2025)
- A Unified Approach For Multi-step Temporal-difference Learning With Eligibility Traces In Reinforcement Learning (2018)