MBPP Leaderboard
Auto-discovered from papers reporting MBPP (pass@1). · Metric: pass@1 (higher is better)
| # | Paper | pass@1 |
|---|---|---|
| 1 | CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models | 98.70 |
| 2 | Poison with Style: A Practical Poisoning Attack on Code Large Language Models | 95.00 |
| 3 | CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging | 90.70 |
| 4 | Planning-Driven Programming: A Large Language Model Programming Workflow | 84.80 |
| 5 | BatCoder: Self-Supervised Bidirectional Code-Documentation Learning via Back-Translation | 81.00 |
| 6 | Modularization is Better: Effective Code Generation with Modular Prompting | 58.10 |
| 7 | FLeX: Fourier-based Low-rank EXpansion for multilingual transfer | 40.10 |
| 8 | Context-Augmented Code Generation Using Programming Knowledge Graphs | 34.00 |
| 9 | Selection of Prompt Engineering Techniques for Code Generation through Predicting Code Complexity | 1.90 |
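
For readers unfamiliar with the metric, the sketch below shows one common way pass@k is computed: the unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of samples generated for a problem and c is the number that pass all tests. For k = 1 this reduces to c / n. It is an illustration only: the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical, and the papers listed above may compute their scores differently (for example, from a single greedy sample per problem), so the numbers in the table are not necessarily comparable under this exact procedure.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failing samples: every k-subset contains a passing one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over problems, as a percentage.

    results: one (n, c) pair per benchmark problem.
    """
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With one sample per problem, `benchmark_pass_at_k` is simply the percentage of problems solved, which is the usual reading of a pass@1 column like the one above.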