RMBench
Emerging | 6 papers using it
First seen: 2025
RMBench is a benchmark dataset used to evaluate the performance of reward models in aligning Large Language Models with human preferences.
Papers using RMBench (6)
- PaTaRM: Bridging Pairwise and Pointwise Signals via Preference-Aware Task-Adaptive Reward Modeling
- CDRRM: Contrast-Driven Rubric Generation for Reliable and Interpretable Reward Modeling
- HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
- Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision
- HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
- Efficient Process Reward Model Training via Active Learning