A Theoretical Connection Between Statistical Physics And Reinforcement Learning

Abstract

Sequential decision making in the presence of uncertainty and stochastic dynamics gives rise to distributions over state/action trajectories in reinforcement learning (RL) and optimal control problems. This observation has led to a variety of connections between RL and inference in probabilistic graphical models (PGMs). Here we explore a different dimension to this relationship, examining reinforcement learning using the tools and abstractions of statistical physics. The central object in the statistical physics abstraction is the partition function \(\mathcal{Z}\), and here we construct a partition function from the ensemble of possible trajectories that an agent might take in a Markov decision process. Although value functions and \(Q\)-functions can be derived from this partition function and interpreted via average energies, the \(\mathcal{Z}\)-function provides an object with its own Bellman equation that can form the basis of alternative dynamic programming approaches.
