
HealthBench-Hard

Emerging
3 papers using it
92 HF downloads
0 HF likes
2025 first seen

HealthBench-Hard is a benchmark used to evaluate the alignment of large language models with clinician preferences in healthcare contexts.

Papers using HealthBench-Hard (3)
