Safety Correction From Baseline: Towards The Risk-aware Policy In Robotics Via Dual-agent Reinforcement Learning
2022 · Linrui Zhang, Zichen Yan, Li Shen, et al.
Abstract
Learning a risk-aware policy is essential but rather challenging in unstructured robotic tasks. Safe reinforcement learning methods open up new possibilities to tackle this problem. However, the conservative policy updates make it intractable to achieve sufficient exploration and desirable performance in complex, sample-expensive environments. In this paper, we propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent. Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control. Concretely, the baseline agent is responsible for maximizing rewards under standard RL settings. Thus, it is compatible with off-the-shelf training techniques of unconstrained optimization, exploration and exploitation. On the other hand, the safe agent mimics the baseline agent for policy improvement and learns to fulfill safety constraints via off-policy RL tuning. In contrast to training from scratch, safe policy corre
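As a rough illustration of the safe agent described in the abstract (it imitates the baseline agent while learning to satisfy a safety constraint), the sketch below combines an imitation term with a Lagrangian cost penalty. This is one common way to handle constraints in safe RL and is an assumption made here; the abstract does not specify the loss. The names `safe_agent_loss`, `update_multiplier`, `lam`, and `cost_limit` are all hypothetical.

```python
import numpy as np

def safe_agent_loss(safe_actions, baseline_actions, cost_values, lam, cost_limit):
    """Hypothetical safe-agent objective: imitate the baseline, penalize cost.

    safe_actions, baseline_actions: arrays of shape (batch, action_dim)
    cost_values: estimated cost of the safe agent's actions, shape (batch,)
    lam: non-negative Lagrange multiplier (assumed, not from the paper)
    cost_limit: allowed average cost (assumed, not from the paper)
    """
    # Imitation term: mean squared distance to the baseline agent's actions.
    imitation = np.mean(np.sum((safe_actions - baseline_actions) ** 2, axis=1))
    # Constraint term: positive when the average cost exceeds the limit.
    violation = np.mean(cost_values) - cost_limit
    return imitation + lam * violation

def update_multiplier(lam, cost_values, cost_limit, lr=0.1):
    """Dual ascent step: raise lam when the constraint is violated, floor at 0."""
    violation = np.mean(cost_values) - cost_limit
    return max(0.0, lam + lr * violation)

# Toy batch: two states, two-dimensional actions.
safe_a = np.array([[0.0, 0.0], [1.0, 1.0]])
base_a = np.array([[0.0, 1.0], [1.0, 1.0]])
costs = np.array([0.2, 0.4])

loss = safe_agent_loss(safe_a, base_a, costs, lam=2.0, cost_limit=0.1)
new_lam = update_multiplier(2.0, costs, cost_limit=0.1)
```

In this sketch the multiplier grows while the average estimated cost stays above the limit, which shifts the safe agent away from pure imitation and towards constraint satisfaction; when the cost falls below the limit, the multiplier shrinks towards zero and imitation dominates again.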
Related papers
- Safe Reinforcement Learning With Dual Robustness (2023) · 8.60
- ActSafe: Active Exploration With Safety Constraints For Reinforcement Learning (2024) · 0.00
- Concurrent Learning Of Policy And Unknown Safety Constraints In Reinforcement Learning (2024) · 0.00
- Model-based Safe Deep Reinforcement Learning Via A Constrained Proximal Policy Optimization Algorithm (2022) · 5.24
- Conservative And Adaptive Penalty For Model-based Safe Reinforcement Learning (2021) · 0.00
- On The Robustness Of Safe Reinforcement Learning Under Observational Perturbations (2022) · 0.00
- Context-aware Safe Reinforcement Learning For Non-stationary Environments (2021) · 9.76
- One Risk To Rule Them All: A Risk-sensitive Perspective On Model-based Offline Reinforcement Learning (2022) · 3.58