
Anthropic HH-RLHF

Emerging
4 papers using it
20 HF downloads
4 HF likes
First seen: 2025

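Anthropic HH-RLHF is a dataset of human preference data on helpfulness and harmlessness (HH), collected for reinforcement learning from human feedback (RLHF). In the papers tracked here, it is used to evaluate reinforcement learning models trained with a focus on privacy-preserving techniques.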

Papers using Anthropic HH-RLHF (4)

Anthropic HH-RLHF | datasets | reinforcement-learning