Fineweb-2

Emerging
4 papers using it
First seen: 2025

Fineweb-2 is a benchmark used to evaluate heuristic filtering methods for curating multilingual training data for large language models.

Papers using Fineweb-2 (4)
