Arena-Hard
Emerging
24 papers using it
66 HF downloads
1 HF like
2024 first seen
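Arena-Hard is a benchmark of challenging, open-ended user prompts drawn from Chatbot Arena. Model responses are scored by an LLM judge through pairwise comparison against a baseline model. Because scoring depends on an LLM judge, the benchmark is also used to study whether models, including reasoning models, can generate outputs that deceive LLM judges.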
Papers using Arena-Hard (24)
- Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization
- Token-weighted Direct Preference Optimization with Attention
- MMoA: An AI-Agent framework with recurrence for Memoried Mixure-of-Agent
- Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
- References Improve LLM Alignment in Non-Verifiable Domains
- The Art of Asking: Multilingual Prompt Optimization for Synthetic Data
- Icon$^{2}$: Aligning Large Language Models Using Self-Synthetic Preference Data via Inherent Regulation
- Not All Preferences are What You Need for Post-Training: Selective Alignment Strategy for Preference Optimization
- P3: Prompts Promote Prompting
- SGPO: Self-Generated Preference Optimization based on Self-Improver
- Robust Preference Optimization via Dynamic Target Margins
- ConfPO: Exploiting Policy Model Confidence for Critical Token Selection in Preference Optimization
- FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
- LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models
- Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs
- MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge
- Alignment through Meta-Weighted Online Sampling: Bridging the Gap between Data Generation and Preference Optimization
- Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
- ComPO: Preference Alignment via Comparison Oracles
- RSPO: Regularized Self-Play Alignment of Large Language Models
- CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom
- Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models
- T-REG: Preference Optimization with Token-Level Reward Regularization
- NILE: Internal Consistency Alignment in Large Language Models