RewardBench

Emerging

15 papers using it

2024 first seen

RewardBench is a benchmark dataset for evaluating reward models, including Generative Reward Models (GRMs). It provides preference data, in which each prompt is paired with a chosen and a rejected response, so that a model's pointwise reward predictions can be checked against the labeled preference.
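As a rough illustration of how pointwise reward predictions can be scored against preference pairs, the sketch below computes the fraction of pairs in which the chosen response receives a higher score than the rejected one. The function name `pairwise_accuracy` and the example scores are hypothetical, and counting ties as misses is an assumption made here, not necessarily the benchmark's official rule.

```python
from typing import Sequence


def pairwise_accuracy(
    chosen_scores: Sequence[float],
    rejected_scores: Sequence[float],
) -> float:
    """Fraction of pairs where the chosen response outscores the rejected one.

    Ties are counted as misses (an assumption of this sketch).
    """
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("chosen and rejected score lists must have equal length")
    if len(chosen_scores) == 0:
        raise ValueError("at least one preference pair is required")
    wins = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return wins / len(chosen_scores)


# Hypothetical pointwise scores from a reward model, one per response.
chosen = [0.9, 0.4, 0.7, 0.2]
rejected = [0.1, 0.6, 0.3, 0.2]

accuracy = pairwise_accuracy(chosen, rejected)
print(f"pairwise accuracy: {accuracy:.2f}")
```

Keeping the metric a pure function over two score lists means it works with any model that can emit one scalar per response, whether that scalar comes from a classifier head or is parsed from a generative judge's output.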


Papers using RewardBench (15)

PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling · 2025 · 4 cites

reward-lens: A Mechanistic Interpretability Library for Reward Models · 2026

IRPM: Intergroup Relative Preference Modeling for Pointwise Generative Reward Models · 2026

Multi-Agent Collaborative Reward Design for Enhancing Reasoning in Reinforcement Learning · 2025

Tiny Reward Models · 2025

Efficient Online RFT with Plug-and-Play LLM Judges: Unlocking State-of-the-Art Performance · 2025

Intra-Trajectory Consistency for Reward Modeling · 2025

Act-Adaptive Margin: Dynamically Calibrating Reward Models for Subjective Ambiguity · 2025

Sentence-level Reward Model can Generalize Better for Aligning LLM from Human Preference · 2025

Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts · 2024 · 10 cites

RewardBench: Evaluating Reward Models for Language Modeling · 2024 · 5 cites

HelpSteer2: Open-source dataset for training top-performing reward models · 2024 · 2 cites

Post-hoc Reward Calibration: A Case Study on Length Bias · 2024 · 1 cite

Quantile Regression for Distributional Reward Models in RLHF · 2024

Evaluating Robustness of Reward Models for Mathematical Reasoning · 2024