ALFWorld
Emerging
32 papers using it
18 HF downloads
0 HF likes
2024 first seen
ALFWorld is a text-based interactive benchmark used to evaluate how well agent systems carry out multi-step tasks. Papers that use it study, among other things, agents that leverage textual skills while keeping context overhead low.
Papers using ALFWorld (32)
- Reinforcement World Model Learning for LLM-based Agents
- Hindsight Credit Assignment for Long-Horizon LLM Agents
- Hera: Learning Long-Horizon Coordination for Device-Cloud Collaborative LLM Agents
- SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards
- Group-in-Group Policy Optimization for LLM Agent Training
- Blueprint First, Model Second: A Framework for Deterministic LLM Workflow
- SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning
- LatentSkill: From In-Context Textual Skills to In-Weight Latent Skills for LLM Agents
- Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents
- Hierarchical Reinforcement Learning with Augmented Step-Level Transitions for LLM Agents
- HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents
- MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents
- Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents
- Reflecting with Two Voices: A Co-Adaptive Dual-Strategy Framework for LLM-Based Agent Decision Making
- Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement
- SkillGen: Learning Domain Skills for In-Context Sequential Decision Making
- Reflect before Act: Proactive Error Correction in Language Models
- Memory-Driven Self-Improvement for Decision Making with Large Language Models
- Enhancing Decision-Making of Large Language Models via Actor-Critic
- GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
- WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
- Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
- Memp: Exploring Agent Procedural Memory
- Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents
- Where LLM Agents Fail and How They can Learn From Failures
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators
- Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
- Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains
- Structured Agent Distillation for Large Language Model
- Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
- ReSpAct: Harmonizing Reasoning, Speaking, and Acting Towards Building Large Language Model-Based Conversational AI Agents
- StateAct: Enhancing LLM Base Agents via Self-prompting and State-tracking