Agent-temporal Attention For Reward Redistribution In Episodic Multi-agent Reinforcement Learning
2022 · Baicen Xiao, Bhaskar Ramasubramanian, Radha Poovendran
Abstract
This paper considers multi-agent reinforcement learning (MARL) tasks where agents receive a shared global reward at the end of an episode. The delayed nature of this reward affects the ability of the agents to assess the quality of their actions at intermediate time-steps. This paper focuses on developing methods to learn a temporal redistribution of the episodic reward to obtain a dense reward signal. Solving such MARL problems requires addressing two challenges: identifying (1) relative importance of states along the length of an episode (along time), and (2) relative importance of individual agents' states at any single time-step (among agents). In this paper, we introduce Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement Learning (AREL) to address these two challenges. AREL uses attention mechanisms to characterize the influence of actions on state transitions along trajectories (temporal attention), and how each agent is affected by other age
Related papers
- Agent-time Attention For Sparse Rewards Multi-agent Reinforcement Learning (2022)
- Agent-temporal Credit Assignment For Optimal Policy Preservation In Sparse Multi-agent Reinforcement Learning (2024)
- STAS: Spatial-temporal Return Decomposition For Multi-agent Reinforcement Learning (2023)
- Hierarchical Deep Multiagent Reinforcement Learning With Temporal Abstraction (2018)
- Distributional Reward Estimation For Effective Multi-agent Deep Reinforcement Learning (2022)
- Multi-agent Reinforcement Learning With Reward Delays (2022)
- Multi-agent Reinforcement Learning Via Adaptive Kalman Temporal Difference And Successor Representation (2021)
- MIR: Efficient Exploration In Episodic Multi-agent Reinforcement Learning Via Mutual Intrinsic Reward (2025)