GPQA-Diamond
Emerging
4 papers using it
First seen: 2025
GPQA-Diamond is a benchmark dataset used to evaluate the performance of reasoning models on reasoning tasks.
Papers using GPQA-Diamond (4)
- OpenThoughts: Data Recipes for Reasoning Models
- QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals
- PRISM: Pushing the Frontier of Deep Think via Process Reward Model-Guided Inference
- Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model