Decomposed Mutual Information Optimization For Generalized Context In Meta-reinforcement Learning
2022 · Yao Mu, Yuzheng Zhuang, Fei Ni, et al.
Abstract
Adapting to the changes in transition dynamics is essential in robotic applications. By learning a conditional policy with a compact context, context-aware meta-reinforcement learning provides a flexible way to adjust behavior according to dynamics changes. However, in real-world applications, the agent may encounter complex dynamics changes. Multiple confounders can influence the transition dynamics, making it challenging to infer accurate context for decision-making. This paper addresses such a challenge by Decomposed Mutual INformation Optimization (DOMINO) for context learning, which explicitly learns a disentangled context to maximize the mutual information between the context and historical trajectories, while minimizing the state transition prediction error. Our theoretical analysis shows that DOMINO can overcome the underestimation of the mutual information caused by multi-confounded challenges via learning disentangled context and reduce the demand for the number of samples co
Related papers
- Dynamics Generalisation In Reinforcement Learning Via Adaptive Context-aware Policies (2023)
- Efficient Off-policy Meta-reinforcement Learning Via Probabilistic Context Variables (2019)
- Prototypical Context-aware Dynamics Generalization For High-dimensional Model-based Reinforcement Learning (2022)
- Reinforcement Learning In Presence Of Discrete Markovian Context Evolution (2022)
- Context Meta-reinforcement Learning Via Neuromodulation (2021)
- Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-based Offline Meta Reinforcement Learning (2024)
- Context-based Soft Actor Critic For Environments With Non-stationary Dynamics (2021)
- Contextual Intelligence The Next Leap For Reinforcement Learning (2026)