Gaia-2
Emerging, 2 papers using it
First seen: 2025
Gaia2 is a benchmark of varied tasks for evaluating the performance and adaptability of autonomous agents, particularly their ability to learn from past interactions and improve their problem solving.