FlowerTune LLM: Code (flowertune-code) Leaderboard
FlowerTune Code track: federated fine-tuning of LLMs for code generation. Avg Score aggregates Pass@1-style accuracy over MBPP, HumanEval, and MultiPL-E (JavaScript, C++). Metric: Avg Score (higher is better).
| # | Model | Avg Score | Paper |
|---|---|---|---|
| 1 | ZeroOne.AI - Qwen3-8B | 65.27 | - |
| 2 | Massimo R. Scamarcia - Qwen3-4B | 60.45 | - |
| 3 | CAR@AIML - deepseek-coder-7b-instruct-v1.5 | 58.77 | - |
| 4 | FL-finetune-JB-DC - Qwen2.5-Coder-7B-Instruct | 56.08 | - |
| 5 | Massimo R. Scamarcia - Phi-4-mini-instruct | 49.00 | - |
| 6 | CAR@AIML - starcoder2-7b | 44.08 | - |
| 7 | Massimo R. Scamarcia - Qwen2.5-7B-Instruct | 34.40 | - |
| 8 | CAR@AIML - CodeLlama-7b-hf | 33.78 | - |
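
The description says Avg Score aggregates Pass@1-style accuracy over four benchmarks but does not spell out the aggregation. The sketch below assumes an unweighted mean of per-benchmark percentages, which may differ from the leaderboard's actual procedure. The helper names `pass_at_1` and `avg_score` are hypothetical, and the numbers are placeholders rather than results from any entry above.

```python
from statistics import mean


def pass_at_1(results):
    """Percentage of problems whose single generated solution passed its tests.

    `results` is a list of booleans, one per problem.
    """
    return 100.0 * sum(results) / len(results)


def avg_score(per_benchmark):
    """Unweighted mean of per-benchmark scores (an assumption, not confirmed
    by the leaderboard), rounded to two decimals like the table."""
    return round(mean(per_benchmark.values()), 2)


# Placeholder scores for illustration only; not taken from any submission.
example = {
    "MBPP": 50.0,
    "HumanEval": 60.0,
    "MultiPL-E (JavaScript)": 40.0,
    "MultiPL-E (C++)": 30.0,
}

print(avg_score(example))  # 45.0
```

Under this assumption each benchmark contributes equally regardless of how many problems it contains. A problem-weighted mean would give a different number.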