DSAC-C: Constrained Maximum Entropy For Robust Discrete Soft-actor Critic
2023 Β· Dexter Neo, Tsuhan Chen
Abstract
We present a novel extension to the family of Soft Actor-Critic (SAC) algorithms. We argue that based on the Maximum Entropy Principle, discrete SAC can be further improved via additional statistical constraints derived from a surrogate critic policy. Furthermore, our findings suggests that these constraints provide an added robustness against potential domain shifts, which are essential for safe deployment of reinforcement learning agents in the real-world. We provide theoretical analysis and show empirical results on low data regimes for both in-distribution and out-of-distribution variants of Atari 2600 games.
Authors
(none)
Tags
Stats
Related papers
- DSAC: Distributional Soft Actor-critic For Risk-sensitive Reinforcement Learning (2020)7.81
- Revisiting Discrete Soft Actor-critic (2022)0.00
- Off-policy Actor-critic In An Ensemble: Achieving Maximum General Entropy And Effective Environment Exploration In Deep Reinforcement Learning (2019)0.00
- Band-limited Soft Actor Critic Model (2020)0.00
- DR-SAC: Distributionally Robust Soft Actor-critic For Reinforcement Learning Under Uncertainty (2025)0.00
- Boosting Soft Actor-critic: Emphasizing Recent Experience Without Forgetting The Past (2019)0.00
- S\(^2\)AC: Energy-based Reinforcement Learning With Stein Soft Actor Critic (2024)2.41
- Improved Soft Actor-critic: Mixing Prioritized Off-policy Samples With On-policy Experience (2021)0.00