Maximum Entropy-regularized Multi-goal Reinforcement Learning
2019 · Rui Zhao, Xudong Sun, Volker Tresp
Abstract
In Multi-Goal Reinforcement Learning, an agent learns to achieve multiple goals with a goal-conditioned policy. During learning, the agent first collects trajectories into a replay buffer, and these trajectories are later selected at random for replay. However, the achieved goals in the replay buffer are often biased towards the behavior policies. From a Bayesian perspective, when there is no prior knowledge about the target goal distribution, the agent should learn uniformly from diverse achieved goals. Therefore, we first propose a novel multi-goal RL objective based on weighted entropy. This objective encourages the agent to maximize the expected return as well as to achieve more diverse goals. Second, we develop a maximum entropy-based prioritization framework to optimize the proposed objective. To evaluate this framework, we combine it with Deep Deterministic Policy Gradient, both with and without Hindsight Experience Replay. On a set of multi-goal robotic tasks of Op
Related papers
- Maximum Entropy Gain Exploration For Long Horizon Multi-goal Reinforcement Learning (2020)
- Dense And Diverse Goal Coverage In Multi Goal Reinforcement Learning (2025)
- Maximum Entropy Diverse Exploration: Disentangling Maximum Entropy Reinforcement Learning (2019)
- Maximum Entropy RL (provably) Solves Some Robust RL Problems (2021)
- Maximum Entropy Heterogeneous-agent Reinforcement Learning (2023)
- Do You Need The Entropy Reward (in Practice)? (2022)
- Off-policy Maximum Entropy RL With Future State And Action Visitation Measures (2024)
- Surprise-adaptive Intrinsic Motivation For Unsupervised Reinforcement Learning (2024)