WebShop
Emerging
21 papers using it
First seen: 2024
WebShop is a benchmark for evaluating LLM-based and reinforcement learning agents on complex, multi-step tasks in a simulated online shopping environment.
Papers using WebShop (21)
- Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization
- Hindsight Credit Assignment for Long-Horizon LLM Agents
- Hera: Learning Long-Horizon Coordination for Device-Cloud Collaborative LLM Agents
- SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards
- Group-in-Group Policy Optimization for LLM Agent Training
- SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning
- HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents
- Meta-RL Induces Exploration in Language Agents
- Learning Hierarchical Procedural Memory for LLM Agents through Bayesian Selection and Contrastive Refinement
- DEPO: Dual-Efficiency Preference Optimization for LLM Agents
- Reflect before Act: Proactive Error Correction in Language Models
- Enhancing Decision-Making of Large Language Models via Actor-Critic
- Exploring Expert Failures Improves LLM Agent Tuning
- Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents
- Where LLM Agents Fail and How They can Learn From Failures
- Structured Agent Distillation for Large Language Model
- Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
- A Training-free LLM Framework with Interaction between Contextually Related Subtasks in Solving Complex Tasks
- EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
- ReSpAct: Harmonizing Reasoning, Speaking, and Acting Towards Building Large Language Model-Based Conversational AI Agents
- StateAct: Enhancing LLM Base Agents via Self-prompting and State-tracking