VerifyBench
Emerging
2 papers using it
127 HF downloads
17 HF likes
First seen: 2025
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
Yuchen Yan1,2,*, Jin Jiang2,3, Zhenbang Ren1,4, Yijun Li1, Xudong Cai1, Yang Liu2, Xin Xu5, Mengdi Zhang2, Jian Shao1,†, Yongliang Shen1,†, Jun Xiao1, Yueting Zhuang1
1Zhejiang University 2Meituan Group 3Peking University 4University of
Hugging Face · MIT license