OlympiadBench
Emerging
7 papers using it
985 HF downloads
5 HF likes
2025 first seen
OlympiadBench is a benchmark for evaluating the complex reasoning capabilities of large language models (LLMs).
Papers using OlympiadBench (7)
- Transformation-Augmented GRPO for Enhancing Exploration in Reasoning of Large Language Models
- LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding
- Prompting Test-Time Scaling Is A Strong LLM Reasoning Data Augmentation
- Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament
- SEED-GRPO: Semantic Entropy Enhanced GRPO for Uncertainty-Aware Policy Optimization
- MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision
- Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models