Quantum Policy Gradient Algorithm With Optimized Action Decoding
2022 · Nico Meyer, Daniel D. Scherer, Axel Plinge, et al.
Abstract
Quantum machine learning implemented by variational quantum circuits (VQCs) is considered a promising concept for the noisy intermediate-scale quantum computing era. Focusing on applications in quantum reinforcement learning, we propose a specific action decoding procedure for a quantum policy gradient approach. We introduce a novel quality measure that enables us to optimize the classical post-processing required for action selection, inspired by local and global quantum measurements. The resulting algorithm demonstrates a significant performance improvement in several benchmark environments. With this technique, we successfully execute a full training routine on a 5-qubit hardware device. Our method introduces only negligible classical overhead and has the potential to improve VQC-based algorithms beyond the field of quantum reinforcement learning.
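To make the idea of classical post-processing for action selection concrete, here is a minimal sketch of how measured bitstring probabilities could be turned into a policy. The abstract does not spell out the decoding procedure, so everything here is an assumption for illustration: the function names `decode_policy`, `local_assign`, and `parity_assign` are hypothetical, and the two assignments (reading a single qubit versus taking the parity of all qubits) are only simple stand-ins for decodings that depend on local versus global information.

```python
from typing import Callable, List


def decode_policy(probs: List[float], n_actions: int,
                  assign: Callable[[int], int]) -> List[float]:
    """Sum bitstring probabilities into action probabilities.

    probs[i] is the probability of measuring the bitstring whose integer
    value is i; assign(i) returns the action that bitstring is mapped to.
    """
    policy = [0.0] * n_actions
    for index, p in enumerate(probs):
        policy[assign(index)] += p
    return policy


def local_assign(n_qubits: int) -> Callable[[int], int]:
    """Hypothetical 'local' decoding: the action is the value of the
    most significant qubit only; all other qubits are ignored."""
    return lambda index: (index >> (n_qubits - 1)) & 1


def parity_assign(index: int) -> int:
    """Hypothetical 'global' decoding: the action is the parity of all
    measured bits, so every qubit influences the result."""
    return bin(index).count("1") % 2


# Illustrative 2-qubit distribution over the bitstrings 00, 01, 10, 11.
probs = [0.5, 0.25, 0.125, 0.125]
local_policy = decode_policy(probs, 2, local_assign(2))
global_policy = decode_policy(probs, 2, parity_assign)
```

The same measurement statistics yield different policies under the two assignments, which is why the choice of post-processing can matter for training. In both cases the overhead is a single pass over the measured distribution.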
Related papers
- Quantum Natural Policy Gradients: Towards Sample-efficient Reinforcement Learning (2023)
- Quantum Policy Iteration Via Amplitude Estimation And Grover Search -- Towards Quantum Advantage For Reinforcement Learning (2022)
- Hybrid Quantum-classical Policy Gradient For Adaptive Control Of Cyber-physical Systems: A Comparative Study Of VQC Vs. MLP (2025)
- Variational Quantum Soft Actor-critic (2021)
- Accelerating Quantum Reinforcement Learning With A Quantum Natural Policy Gradient Based Approach (2025)
- Variational Quantum Circuits For Deep Reinforcement Learning (2019)
- From Classical Data To Quantum Advantage -- Quantum Policy Evaluation On Quantum Hardware (2025)
- Quantum Reinforcement Learning By Adaptive Non-local Observables (2025)