Informed Asymmetric Actor-critic: Leveraging Privileged Signals Beyond Full-state Access
2025 · Daniel Ebi, Gaspard Lambrechts, Damien Ernst, et al.
Abstract
Asymmetric actor-critic methods are widely used in partially observable reinforcement learning, but typically assume full state observability to condition the critic during training, which is often unrealistic in practice. We introduce the informed asymmetric actor-critic framework, allowing the critic to be conditioned on arbitrary state-dependent privileged signals without requiring access to the full state. We show that any such privileged signal yields unbiased policy gradient estimates, substantially expanding the set of admissible privileged information. This raises the problem of selecting the most adequate privileged information in order to improve learning. For this purpose, we propose two novel informativeness criteria: a dependence-based test that can be applied prior to training, and a criterion based on improvements in value prediction accuracy that can be applied post-hoc. Empirical results on partially observable benchmark tasks and synthetic environments demonstrate tha
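The asymmetry described in the abstract can be illustrated with a minimal sketch: the actor sees only history features, while the critic additionally receives a privileged signal during training. This is not the authors' implementation. The linear critic, the softmax actor, the one-step TD error used as the advantage, and all names and sizes (`actor_probs`, `critic_value`, `h_dim`, `i_dim`) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def actor_probs(theta, h):
    """Policy pi(a | h): depends on the history features h only."""
    return softmax(theta @ h)

def critic_value(w, h, i):
    """Informed critic V(h, i): history features plus a privileged signal i."""
    return float(w @ np.concatenate([h, i]))

def actor_gradient(theta, h, a, advantage):
    """Score-function gradient: advantage * grad_theta log pi(a | h)."""
    p = actor_probs(theta, h)
    one_hot = np.zeros_like(p)
    one_hot[a] = 1.0
    return advantage * np.outer(one_hot - p, h)

# Hypothetical sizes and a single hypothetical transition.
n_actions, h_dim, i_dim = 3, 4, 2
theta = np.zeros((n_actions, h_dim))   # actor parameters
w = np.zeros(h_dim + i_dim)            # critic parameters

h, i = rng.normal(size=h_dim), rng.normal(size=i_dim)
h_next, i_next = rng.normal(size=h_dim), rng.normal(size=i_dim)
a, r, gamma = 1, 1.0, 0.99

# The privileged signal enters only through the critic's TD error.
td_error = r + gamma * critic_value(w, h_next, i_next) - critic_value(w, h, i)
grad_theta = actor_gradient(theta, h, a, td_error)
```

At deployment only `actor_probs` is needed, so the privileged signal `i` never has to be available outside training.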
Related papers
- Unbiased Asymmetric Reinforcement Learning Under Partial Observability (2021)
- Provable Partially Observable Reinforcement Learning With Privileged Information (2024)
- ReLU To The Rescue: Improve Your On-policy Actor-critic With Positive Advantages (2023)
- Multi-agent Off-policy Actor-critic Reinforcement Learning For Partially Observable Environments (2024)
- Studying The Interplay Between The Actor And Critic Representations In Reinforcement Learning (2025)
- Actor-critic Policy Optimization In Partially Observable Multiagent Environments (2018)
- Discriminator-actor-critic: Addressing Sample Inefficiency And Reward Bias In Adversarial Imitation Learning (2018)
- Actor-dual-critic Dynamics For Zero-sum And Identical-interest Stochastic Games (2026)