ProcessBench
Emerging
11 papers using it
4,891 HF downloads
59 HF likes
2025 first seen
ProcessBench
This repository contains the dataset for the ProcessBench benchmark proposed by the Qwen Team. See our GitHub repository for the evaluation code and the prompt templates used in this work. If you find this work relevant or helpful, please cite us: @article{processbench, title=
Hugging Face | License: apache-2.0
Papers using ProcessBench (11)
- Unsupervised Process Reward Models
- RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback
- GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
- Process Reward Models That Think
- SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
- Solve-Detect-Verify: Inference-Time Scaling with Flexible Generative Verifier
- RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
- Training Step-Level Reasoning Verifiers with Formal Verification Tools
- RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback
- RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning
- Efficient Process Reward Model Training via Active Learning