Harnessing The Power Of Reinforcement Learning For Adaptive MCMC
2025 · Congye Wang, Matthew A. Fisher, Heishiro Kanagawa, et al.
Abstract
Sampling algorithms drive probabilistic machine learning, and recent years have seen an explosion in the diversity of tools for this task. However, the increasing sophistication of sampling algorithms is correlated with an increase in the tuning burden. There is now a greater need than ever to treat the tuning of samplers as a learning task in its own right. In a conceptual breakthrough, Wang et al. (2025) formulated Metropolis-Hastings as a Markov decision process, opening up the possibility for adaptive tuning using Reinforcement Learning (RL). Their emphasis was on theoretical foundations; realising the practical benefit of Reinforcement Learning Metropolis-Hastings (RLMH) was left for subsequent work. The purpose of this paper is twofold. First, we observe the surprising result that natural choices of reward, such as the acceptance rate or the expected squared jump distance, provide insufficient signal for training RLMH. Instead, we propose a novel reward based on the contrastive d
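For readers unfamiliar with the two "natural" rewards named in the abstract, the following is a minimal sketch of how the acceptance rate and the expected squared jump distance (ESJD) can be estimated from a plain random-walk Metropolis-Hastings run. It is not RLMH and not the reward proposed in the paper. The standard normal target, the Gaussian random-walk proposal, and the names `log_target` and `rwm_diagnostics` are all assumptions made for illustration.

```python
import math
import random


def log_target(x):
    # Log-density of a standard normal, up to an additive constant.
    # Illustrative target only (an assumption, not taken from the paper).
    return -0.5 * x * x


def rwm_diagnostics(step_size, n_steps=20000, seed=0):
    """Run 1-D random-walk Metropolis; return (acceptance_rate, esjd)."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    sq_jump_sum = 0.0
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the Hastings ratio reduces
        # to the ratio of target densities.
        proposal = x + step_size * rng.gauss(0.0, 1.0)
        log_alpha = log_target(proposal) - log_target(x)
        accept_prob = math.exp(min(0.0, log_alpha))
        if rng.random() < accept_prob:
            sq_jump_sum += (proposal - x) ** 2
            x = proposal
            accepted += 1
        # A rejected proposal contributes a squared jump of zero.
    return accepted / n_steps, sq_jump_sum / n_steps


if __name__ == "__main__":
    for step in (0.1, 2.4, 10.0):
        rate, esjd = rwm_diagnostics(step)
        print(f"step={step:5.1f}  acceptance={rate:.3f}  ESJD={esjd:.3f}")
```

In this toy setting a very small step size gives a high acceptance rate but a small ESJD, which is one reason the two quantities are usually read together when tuning by hand.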