Understanding And Improving Hyperbolic Deep Reinforcement Learning
2025 · Timo Klein, Thomas Lang, Andrii Shkabrii, et al.
Abstract
The exponential volume growth of hyperbolic geometry can embed the hierarchical relationships between states in reinforcement learning (RL) with far less distortion than Euclidean space. However, hyperbolic deep RL faces severe optimization challenges, and formal analysis of why optimization fails is lacking. We identify key factors that determine the success and failure of training hyperbolic deep RL agents. By analyzing the gradients of core operations in the Poincaré ball and hyperboloid models of hyperbolic geometry, we show that large-norm embeddings destabilize gradient-based training, leading to trust-region violations in proximal policy optimization (PPO). Based on these insights, we introduce Hyper++, a new hyperbolic deep RL agent that consists of three components: (1) feature regularization guaranteeing bounded norms while avoiding the curse of dimensionality from clipping; (2) a categorical value loss for stable critic training; and (3) a more optimization-friendly formul
Related papers
- Hyperl: Hypernetwork-based Reinforcement Learning For Control Of Parametrized Dynamical Systems (2025)
- Hyperparameter Optimisation With Practical Interpretability And Explanation Methods In Probabilistic Curriculum Learning (2025)
- The Effective Horizon Explains Deep RL Performance In Stochastic Environments (2023)
- Rethinking KL Regularization In RLHF: From Value Estimation To Gradient Optimization (2025)
- Exploration-driven Policy Optimization In RLHF: Theoretical Insights On Efficient Data Utilization (2024)
- Hyperparameter Tuning For Deep Reinforcement Learning Applications (2022)
- Dissecting Deep RL With High Update Ratios: Combatting Value Divergence (2024)
- Halypo: Heterogeneous-agent Lyapunov Policy Optimization For Human-robot Collaboration (2026)