Trainability Issues In Quantum Policy Gradients
2024 · André Sequeira, Luis Paulo Santos, Luis Soares Barbosa
Abstract
This research explores the trainability of parameterized quantum circuit-based policies in reinforcement learning, an area that has recently seen a surge in empirical exploration. While some studies suggest improved sample complexity using quantum gradient estimation, the efficient trainability of these policies remains an open question. Our findings reveal significant challenges, including standard barren plateaus with exponentially small gradients, as well as gradient explosion. These phenomena depend on the type of basis-state partitioning and on how the partitions are mapped onto actions. For a polynomial number of actions, a trainable window can be ensured with a polynomial number of measurements if a contiguous-like partitioning of basis states is employed. These results are empirically validated in a multi-armed bandit environment.
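As a rough illustration of what a contiguous-like partitioning of basis states could look like, the sketch below assigns consecutive blocks of computational basis-state indices to actions and sums the measurement probabilities within each block to obtain a policy. The function names and the block-assignment formula are assumptions made for illustration; the abstract does not specify the exact construction used in the paper.

```python
import numpy as np

def contiguous_partition(n_qubits: int, n_actions: int) -> np.ndarray:
    """Map each basis-state index to an action using consecutive blocks.

    Hypothetical helper: index b goes to action floor(b * n_actions / 2**n_qubits),
    so basis states that are adjacent in index order share an action.
    """
    dim = 2 ** n_qubits
    return (np.arange(dim) * n_actions) // dim

def policy_from_basis_probs(basis_probs: np.ndarray,
                            assignment: np.ndarray,
                            n_actions: int) -> np.ndarray:
    """Sum basis-state probabilities within each partition to get pi(a)."""
    return np.bincount(assignment, weights=basis_probs, minlength=n_actions)

# Example: 3 qubits (8 basis states) split into 4 actions.
assignment = contiguous_partition(n_qubits=3, n_actions=4)
uniform = np.full(8, 1.0 / 8)  # stand-in for measured basis-state probabilities
policy = policy_from_basis_probs(uniform, assignment, n_actions=4)
```

In practice the basis-state probabilities would be estimated from a finite number of circuit measurements rather than given exactly, which is where the measurement budget mentioned in the abstract comes in.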
Related papers
- Quantum Natural Policy Gradients: Towards Sample-efficient Reinforcement Learning (2023)
- On Quantum Natural Policy Gradients (2024)
- Quantum Policy Iteration Via Amplitude Estimation And Grover Search -- Towards Quantum Advantage For Reinforcement Learning (2022)
- Quantum Policy Gradient Algorithm With Optimized Action Decoding (2022)
- Accelerating Quantum Reinforcement Learning With A Quantum Natural Policy Gradient Based Approach (2025)
- Robustness And Generalization In Quantum Reinforcement Learning Via Lipschitz Regularization (2024)
- Hybrid Quantum-classical Policy Gradient For Adaptive Control Of Cyber-physical Systems: A Comparative Study Of VQC Vs. MLP (2025)
- Auxiliary Task-based Deep Reinforcement Learning For Quantum Control (2023)