Intervention-assisted Policy Gradient Methods For Online Stochastic Queuing Network Optimization: Technical Report
2024 · Jerrod Wigmore, Brooke Shrader, Eytan Modiano
Abstract
Deep Reinforcement Learning (DRL) offers a powerful approach to training neural network control policies for stochastic queuing networks (SQN). However, traditional DRL methods rely on offline simulations or static datasets, limiting their real-world application in SQN control. This work proposes Online Deep Reinforcement Learning-based Controls (ODRLC) as an alternative, where an intelligent agent interacts directly with a real environment and learns an optimal control policy from these online interactions. SQNs present a challenge for ODRLC due to the unbounded nature of the queues within the network resulting in an unbounded state-space. An unbounded state-space is particularly challenging for neural network policies as neural networks are notoriously poor at extrapolating to unseen states. To address this challenge, we propose an intervention-assisted framework that leverages strategic interventions from known stable policies to ensure the queue sizes remain bounded. This framework
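The intervention idea in the abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the report: the two-queue single-server model, the placeholder `learned_policy` (a stand-in for a neural network policy), the use of longest-queue-first as the "known stable policy", and the rule of intervening whenever total backlog exceeds a fixed `threshold`. The report's actual intervention criterion and training procedure may differ.

```python
import random


def learned_policy(queues):
    """Hypothetical stand-in for a neural network policy: always serve queue 0."""
    return 0


def stable_policy(queues):
    """Longest-queue-first: serve the queue with the largest backlog (assumed stable here)."""
    return max(range(len(queues)), key=lambda i: queues[i])


def intervention_assisted_action(queues, threshold):
    """Return (action, intervened).

    Assumed rule: hand control to the stable policy whenever the total
    backlog exceeds `threshold`; otherwise let the learned policy act.
    """
    if sum(queues) > threshold:
        return stable_policy(queues), True
    return learned_policy(queues), False


def simulate(steps, arrival_probs, threshold, seed=0):
    """Run a discrete-time single-server system with Bernoulli arrivals.

    Each slot the server removes at most one packet from the chosen queue,
    then each queue receives one packet with its arrival probability.
    Returns the total-backlog trace and the number of interventions.
    """
    rng = random.Random(seed)
    queues = [0] * len(arrival_probs)
    backlog_trace, interventions = [], 0
    for _ in range(steps):
        action, intervened = intervention_assisted_action(queues, threshold)
        interventions += int(intervened)
        if queues[action] > 0:
            queues[action] -= 1  # serve one packet from the chosen queue
        for i, p in enumerate(arrival_probs):
            if rng.random() < p:
                queues[i] += 1  # new arrival to queue i
        backlog_trace.append(sum(queues))
    return backlog_trace, interventions
```

Calling `simulate(200, [0.3, 0.3], threshold=5)` and comparing its backlog trace with a run that uses a very large threshold (so the placeholder learned policy is never overridden) gives a rough feel for how interventions keep the visited states within a limited region.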