Abstract

Deep Actor-Critic algorithms, which combine Actor-Critic with deep neural network (DNN), have been among the most prevalent reinforcement learning algorithms for decision-making problems in simulated environments. However, the existing deep Actor-Critic algorithms are still not mature to solve realistic problems with non-convex stochastic constraints and high cost to interact with the environment. In this paper, we propose a single-loop deep Actor-Critic (SLDAC) algorithmic framework for general constrained reinforcement learning (CRL) problems. In the actor step, the constrained stochastic successive convex approximation (CSSCA) method is applied to handle the non-convex stochastic objective and constraints. In the critic step, the critic DNNs are only updated once or a few finite times for each iteration, which simplifies the algorithm to a single-loop framework (the existing works require a sufficient number of updates for the critic step to ensure a good enough convergence of the i

Authors

(none)

Tags

  • Uncategorized

Stats

  • citations4
  • S2 citationsβ€”
  • github stars0
  • HF likes0
  • heat score5.24
  • arxiv keywang2023a

Related papers