Arena-Hard

Emerging

13 papers using it

84 HF downloads

1 HF like

First seen: 2024

Arena-Hard is a benchmark for evaluating LLMs on alignment tasks. It provides challenging scenarios that require reasoning and decision-making, and whose responses cannot be checked against a verifiable ground truth.

🤗 Hugging Face

Papers using Arena-Hard (13)

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards · 2026

References Improve LLM Alignment in Non-Verifiable Domains · 2026

Reward Model Routing in Alignment · 2025

Online Rubrics Elicitation from Pairwise Comparisons · 2025

TGDPO: Harnessing Token-Level Reward Guidance for Enhancing Direct Preference Optimization · 2025

Pretrain Value, Not Reward: Decoupled Value Policy Optimization · 2025

Scalable Reinforcement Post-Training Beyond Static Human Prompts: Evolving Alignment via Asymmetric Self-Play · 2024

DPO Meets PPO: Reinforced Token Optimization for RLHF · 2024

SimPO: Simple Preference Optimization with a Reference-Free Reward · 2024 · 17 cites

RLHF Workflow: From Reward Modeling to Online RLHF · 2024 · 3 cites

The Perfect Blend: Redefining RLHF with Mixture of Judges · 2024 · 2 cites

AlphaDPO: Adaptive Reward Margin for Direct Preference Optimization · 2024

T-REG: Preference Optimization with Token-Level Reward Regularization · 2024