
UltraFeedback

Emerging
6 papers using it
5,121 HF downloads
423 HF likes
2025 first seen

Introduction

GitHub Repo, UltraRM-13b, UltraCM-13b

UltraFeedback is a large-scale, fine-grained, diverse preference dataset used for training powerful reward models and critic models. We collect about 64k prompts from diverse sources (including UltraChat, ShareGPT, Evol-Instruct, TruthfulQA, FalseQA, and FLAN). We the

Papers using UltraFeedback (6)
