
GSM8K

Emerging
25 papers using it
2024 first seen

GSM8K is a benchmark of roughly 8,500 grade-school math word problems that require multi-step arithmetic reasoning. It is used to evaluate the reasoning abilities of large language models.

Papers using GSM8K (25)
