Qwen2.5-Math-7B
Emerging: 4 papers using it
First seen: 2025
The 'Qwen2.5-Math-7B' dataset/benchmark contains prompts designed for evaluating reinforcement learning with verifiable rewards (RLVR) on reasoning tasks with deterministic outcomes.