Sequential Bayesian Optimal Experimental Design In Infinite Dimensions Via Policy Gradient Reinforcement Learning
2026 · Kaichen Shen, Peng Chen
Abstract
Sequential Bayesian optimal experimental design (SBOED) for PDE-governed inverse problems is computationally challenging, especially for infinite-dimensional random field parameters. High-fidelity approaches require repeated forward and adjoint PDE solves inside nested Bayesian inversion and design loops. We formulate SBOED as a finite-horizon Markov decision process and learn an amortized design policy via policy-gradient reinforcement learning (PGRL), enabling online design selection from the experiment history without repeatedly solving an SBOED optimization problem. To make policy training and reward evaluation scalable, we combine dual dimension reduction -- active subspace projection for the parameter and principal component analysis for the state -- with an adjusted derivative-informed latent attention neural operator (LANO) surrogate that predicts both the parameter-to-solution map and its Jacobian. We use a Laplace-based D-optimality reward while noting that, in general, other
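As a rough illustration of the Laplace-based D-optimality reward mentioned in the abstract, the sketch below computes the Gaussian information gain for a sequence of experiments under a linearized (Laplace) model. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a reduced (finite-dimensional) parameter, i.i.d. Gaussian observation noise with variance `noise_var`, and one observation Jacobian per experiment. The function names `d_optimality` and `incremental_rewards` are hypothetical, and in practice the Jacobians would come from a PDE solver or a surrogate rather than being passed in directly.

```python
import numpy as np


def d_optimality(jacobian, prior_cov, noise_var):
    """Gaussian information gain 0.5 * logdet(I + J C_pr J^T / noise_var).

    Under a linear-Gaussian (or Laplace-linearized) model this equals
    0.5 * (logdet C_prior - logdet C_posterior).
    """
    n_obs = jacobian.shape[0]
    gram = jacobian @ prior_cov @ jacobian.T / noise_var
    _, logdet = np.linalg.slogdet(np.eye(n_obs) + gram)
    return 0.5 * logdet


def incremental_rewards(jacobians, prior_cov, noise_var):
    """Per-stage reward: the gain in D-optimality from each new experiment.

    Assumption (hypothetical setup): each entry of `jacobians` is the
    observation Jacobian of one experiment, all linearized at one point.
    """
    rewards = []
    previous_total = 0.0
    stacked = np.empty((0, prior_cov.shape[0]))
    for jac in jacobians:
        # Accumulate all observations collected so far.
        stacked = np.vstack([stacked, jac])
        total = d_optimality(stacked, prior_cov, noise_var)
        rewards.append(total - previous_total)
        previous_total = total
    return rewards
```

With this incremental definition the stage rewards telescope, so their sum equals the D-optimality of the full experiment history, which is one common way to assign per-step rewards in a finite-horizon Markov decision process.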
Related papers
- Statistically Efficient Bayesian Sequential Experiment Design Via Reinforcement Learning With Cross-entropy Estimators (2023)
- Learning Optimal Deterministic Policies With Stochastic Policy Gradients (2024)
- Proximal Policy Optimization Algorithms (2017)
- Revisiting Design Choices In Proximal Policy Optimization (2020)
- ANO: A Principled Approach To Robust Policy Optimization (2026)
- Sequential Bayesian Experimental Designs Via Reinforcement Learning (2022)
- Sequential Monte Carlo For Policy Optimization In Continuous POMDPs (2025)
- Direct Policy Gradients: Direct Optimization Of Policies In Discrete Action Spaces (2019)