Decision Mamba: Reinforcement Learning Via Sequence Modeling With Selective State Spaces
2024 · Toshihiro Ota
Abstract
Decision Transformer, a promising approach that applies Transformer architectures to reinforcement learning, relies on causal self-attention to model sequences of states, actions, and rewards. While this method has shown competitive results, this paper investigates the integration of the Mamba framework, known for its advanced capabilities in efficient and effective sequence modeling, into the Decision Transformer architecture, focusing on the potential performance enhancements in sequential decision-making tasks. Our study systematically evaluates this integration by conducting a series of experiments across various decision-making environments, comparing the modified Decision Transformer, Decision Mamba, with its traditional counterpart. This work contributes to the advancement of sequential decision-making models, suggesting that the architecture and training methodology of neural networks can significantly impact their performance in complex tasks, and highlighting the potential of
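As a rough illustration of the sequence-modeling setup the abstract refers to, the sketch below lays out a trajectory as interleaved tokens and builds the causal mask a self-attention layer would use. It rests on assumptions not stated on this page: the token layout (return-to-go, state, action) follows the original Decision Transformer formulation, and Decision Mamba is assumed to keep that layout while swapping the attention layer for a Mamba block. The function names are hypothetical and no actual Mamba layer is implemented here.

```python
import numpy as np

def returns_to_go(rewards):
    """Undiscounted suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ..."""
    rewards = np.asarray(rewards, dtype=float)
    return np.cumsum(rewards[::-1])[::-1]

def interleave_tokens(rtg, states, actions):
    """Lay out a trajectory as (R_0, s_0, a_0, R_1, s_1, a_1, ...).

    Assumed layout, taken from the original Decision Transformer setup.
    """
    tokens = []
    for t in range(len(rtg)):
        tokens.append(("return_to_go", t, rtg[t]))
        tokens.append(("state", t, states[t]))
        tokens.append(("action", t, actions[t]))
    return tokens

def causal_mask(n):
    """Lower-triangular mask: position i may attend only to positions j <= i."""
    return np.tril(np.ones((n, n), dtype=bool))

# Toy trajectory with three time steps (placeholder values).
rewards = [1.0, 0.0, 2.0]
states = ["s0", "s1", "s2"]
actions = ["a0", "a1", "a2"]

rtg = returns_to_go(rewards)
tokens = interleave_tokens(rtg, states, actions)
mask = causal_mask(len(tokens))
```

A state space layer such as Mamba processes the same token sequence from left to right, so it would not need an explicit mask of this kind; the mask is shown only to make the attention-based baseline concrete.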
Related papers
- Decision Mamba: A Multi-grained State Space Model With Self-evolution Regularization For Offline RL (2024)
- DODT: Enhanced Online Decision Transformer Learning Through Dreamer's Actor-critic Trajectory Forecasting (2024)
- Drama: Mamba-enabled Model-based Reinforcement Learning Is Sample And Parameter Efficient (2024)
- Hierarchical Prompt Decision Transformer: Improving Few-shot Policy Generalization With Global And Adaptive Guidance (2024)
- Offline Pre-trained Multi-agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks (2021)
- UPDeT: Universal Multi-agent Reinforcement Learning Via Policy Decoupling With Transformers (2021)
- Q-learning Decision Transformer: Leveraging Dynamic Programming For Conditional Sequence Modelling In Offline RL (2022)
- Waypoint Transformer: Reinforcement Learning Via Supervised Learning With Intermediate Targets (2023)