GSM8K Leaderboard
Auto-discovered from papers reporting GSM8K (Accuracy) · Metric: Accuracy (higher is better)
| # | Model | Accuracy | Paper |
|---|---|---|---|
| 1 | Beyond KL Divergence: Policy Optimization With Flexible Bregman Divergences For LLM Reasoning | 86.70 | |
| 2 | AGPO: Adaptive Group Policy Optimization with Dual Statistical Feedback | 67.30 | |
| 3 | Testing LLM Arithmetic Reasoning Generalization with Automatic Numeric-Remapping Attacks | 12.16 | |