Action Redundancy In Reinforcement Learning
2021 · Nir Baram, Guy Tennenholtz, Shie Mannor
Abstract
Maximum Entropy (MaxEnt) reinforcement learning is a powerful learning paradigm that seeks to maximize return under entropy regularization. However, action entropy does not necessarily coincide with state entropy, e.g., when multiple actions produce the same transition. Instead, we propose to maximize the transition entropy, i.e., the entropy of next states. We show that transition entropy can be described by two terms: model-dependent transition entropy and action redundancy. In particular, we explore the latter in both deterministic and stochastic settings and develop tractable approximation methods in a near model-free setup. We construct algorithms to minimize action redundancy and demonstrate their effectiveness on a synthetic environment with multiple redundant actions as well as on contemporary benchmarks in Atari and MuJoCo. Our results suggest that action redundancy is a fundamental problem in reinforcement learning.
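The mismatch between action entropy and next-state entropy described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' algorithm; it assumes a single state with deterministic transitions, and the action names, state names, and helper functions are all hypothetical.

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def next_state_distribution(policy, transition):
    """Push an action distribution through a deterministic transition map."""
    dist = defaultdict(float)
    for action, prob in policy.items():
        dist[transition[action]] += prob
    return dict(dist)


# Hypothetical single-state example: three actions are redundant
# (all lead to "left"), and one action leads to "right".
transition = {"a0": "left", "a1": "left", "a2": "left", "a3": "right"}

# A uniform policy maximizes action entropy ...
uniform_policy = {action: 0.25 for action in transition}
action_entropy = entropy(uniform_policy.values())

# ... but the induced next-state distribution is skewed toward "left".
transition_entropy = entropy(
    next_state_distribution(uniform_policy, transition).values()
)

# A policy that splits probability evenly over next states instead.
balanced_policy = {"a0": 1 / 6, "a1": 1 / 6, "a2": 1 / 6, "a3": 0.5}
balanced_entropy = entropy(
    next_state_distribution(balanced_policy, transition).values()
)

print(f"action entropy (uniform policy):     {action_entropy:.3f}")
print(f"next-state entropy (uniform policy): {transition_entropy:.3f}")
print(f"next-state entropy (balanced policy): {balanced_entropy:.3f}")
```

In this toy setting the uniform policy has the highest possible action entropy, yet its next-state entropy is lower than that of the balanced policy, because three of the four actions collapse onto the same transition. The gap between the two quantities is one informal way to view the redundancy the abstract refers to; the paper's formal definition is not reproduced here.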
Related papers
- Off-policy Maximum Entropy RL With Future State And Action Visitation Measures (2024)
- Do You Need The Entropy Reward (in Practice)? (2022)
- Maximum-entropy Exploration With Future State-action Visitation Measures (2026)
- Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning (2019)
- Maximum Entropy RL (provably) Solves Some Robust RL Problems (2021)
- Fast Rates For Maximum Entropy Exploration (2023)
- Tsallis Reinforcement Learning: A Unified Framework For Maximum Entropy Reinforcement Learning (2019)
- No Prior Mask: Eliminate Redundant Action For Deep Reinforcement Learning (2023)