Intrinsic Benefits Of Categorical Distributional Loss: Uncertainty-aware Regularized Exploration In Reinforcement Learning
2021 · Ke Sun, Yingnan Zhao, Enze Shi, et al.
Abstract
The remarkable empirical performance of distributional reinforcement learning (RL) has garnered increasing attention to understanding its theoretical advantages over classical RL. By decomposing the categorical distributional loss commonly employed in distributional RL, we find that the potential superiority of distributional RL can be attributed to a derived distribution-matching entropy regularization. This less-studied entropy regularization aims to capture additional knowledge of return distribution beyond only its expectation, contributing to an augmented reward signal in policy optimization. In contrast to the vanilla entropy regularization in MaxEnt RL, which explicitly encourages exploration by promoting diverse actions, the novel entropy regularization derived from categorical distributional loss implicitly updates policies to align the learned policy with (estimated) environmental uncertainty. Finally, extensive experiments verify the significance of this uncertainty-aware re
Related papers
- Exploring The Training Robustness Of Distributional Reinforcement Learning Against Noisy State Observations (2021)
- A Comparative Analysis Of Expected And Distributional Reinforcement Learning (2019)
- Distributional Reinforcement Learning With Regularized Wasserstein Loss (2022)
- Marginalized State Distribution Entropy Regularization In Policy Optimization (2019)
- A Robust Quantile Huber Loss With Interpretable Parameter Adjustment In Distributional Reinforcement Learning (2024)
- A Comparative Theoretical Analysis Of Entropy Control Methods In Reinforcement Learning (2026)
- Improving Robustness Via Risk Averse Distributional Reinforcement Learning (2020)
- The Curious Price Of Distributional Robustness In Reinforcement Learning With A Generative Model (2023)