Goal-oriented Inference Of Environment From Redundant Observations
2023 · Kazuki Takahashi, Tomoki Fukai, Yutaka Sakai, et al.
Abstract
An agent learns to organize its decisions to achieve a behavioral goal, such as reward maximization, and reinforcement learning is often used for this optimization. Learning an optimal behavioral strategy is difficult when the events necessary for learning are only partially observable, a setting known as a Partially Observable Markov Decision Process (POMDP). However, real-world environments also present many events that are irrelevant to reward delivery and to the optimal behavioral strategy. Conventional POMDP methods, which attempt to infer transition rules among all observations, including irrelevant states, are ineffective in such environments. Assuming a Redundantly Observable Markov Decision Process (ROMDP), we propose a method for goal-oriented reinforcement learning that efficiently learns state transition rules among reward-related "core states" from redundant observations. Starting with a small number of initial core states, our model gradually adds new co
Related papers
- Optimal Decision-making In Mixed-agent Partially Observable Stochastic Environments Via Reinforcement Learning (2019)
- Active Inference And Reinforcement Learning: A Unified Inference On Continuous State And Action Spaces Under Partial Observability (2022)
- An Agent Design With Goal Reaching Guarantees For Enhancement Of Learning (2024)
- Correlation Priors For Reinforcement Learning (2019)
- Task-guided IRL In POMDPs That Scales (2022)
- Inverse Rational Control: Inferring What You Think From How You Forage (2018)
- Reinforcement Learning Under Partial Observability Guided By Learned Environment Models (2022)
- Sample-efficient Reinforcement Learning In The Presence Of Exogenous Information (2022)