← all datasets

AppWorld

Emerging
7 papers using it
First seen: 2025

AppWorld is a benchmark dataset containing a collection of applications and their associated metadata. It is used to evaluate how well language model agents learn from agentic traces in a parallel execution context.

Papers using AppWorld (7)
