Mutual Information Tracks Policy Coherence In Reinforcement Learning
2025 · Cameron Reid, Wael Hafez, Amirhossein Nazeri
Abstract
Reinforcement Learning (RL) agents deployed in real-world environments face degradation from sensor faults, actuator wear, and environmental shifts, yet lack intrinsic mechanisms to detect and diagnose these failures. We present an information-theoretic framework that both reveals the fundamental dynamics of RL and provides practical methods for diagnosing deployment-time anomalies. Through analysis of state-action mutual information patterns in a robotic control task, we first demonstrate that successful learning exhibits characteristic information signatures: mutual information between states and actions steadily increases from 0.84 to 2.83 bits (238% growth) despite growing state entropy, indicating that agents develop increasingly selective attention to task-relevant patterns. Intriguingly, the joint mutual information between state-action pairs and next states, MI(S,A;S'), follows an inverted U-curve, peaking during early learning before declining as the agent specializes, suggesting a transition
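The quantities named in the abstract can be illustrated with a minimal sketch. It assumes states and actions have already been discretized into integer labels and uses a simple plug-in (histogram) estimator. The abstract does not say which estimator the authors use, so the function name, the discretization, and the toy data below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def mutual_information_bits(x, y):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    Assumption: x and y are equal-length 1-D sequences of discrete labels.
    """
    x = np.asarray(x).reshape(-1)
    y = np.asarray(y).reshape(-1)
    # Map each distinct label to a contiguous integer index.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    xi, yi = xi.reshape(-1), yi.reshape(-1)
    # Build the empirical joint distribution p(x, y).
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)
    joint /= joint.sum()
    # Marginals p(x) and p(y), kept 2-D so their product has the joint's shape.
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    # Sum p(x,y) * log2(p(x,y) / (p(x) p(y))) over cells with nonzero mass.
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

# Toy data (hypothetical): 4 equally likely states.
states = np.tile(np.arange(4), 25)
coherent_actions = states.copy()           # action fully determined by state
constant_actions = np.zeros_like(states)   # action ignores the state

mi_coherent = mutual_information_bits(states, coherent_actions)
mi_constant = mutual_information_bits(states, constant_actions)
```

With this toy data, a policy whose action is fully determined by the state reaches the state entropy (2 bits for four equiprobable states), while a policy that ignores the state gives 0 bits. A quantity of the form MI(S,A;S') can be estimated with the same function by first encoding each (state, action) pair as a single label, for example `states * n_actions + actions` for a hypothetical action count `n_actions`. Plug-in estimates are biased upward when samples are few relative to the number of distinct labels.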
Related papers
- Which Mutual-information Representation Learning Objectives Are Sufficient For Control? (2021)
- Robust Multi-agent Reinforcement Learning By Mutual Information Regularization (2023)
- Mutual Information Regularized Offline Reinforcement Learning (2022)
- Quantifying First-order Markov Violations In Noisy Reinforcement Learning: A Causal Discovery Approach (2025)
- Conditional Mutual Information For Disentangled Representations In Reinforcement Learning (2023)
- Policy Information Capacity: Information-theoretic Measure For Task Complexity In Deep Reinforcement Learning (2021)
- Causal Influence Detection For Improving Efficiency In Reinforcement Learning (2021)
- PMIC: Improving Multi-agent Reinforcement Learning With Progressive Mutual Information Collaboration (2022)