FlowerTune LLM: Medical (flowertune-medical) Leaderboard
FlowerTune Medical track: federated fine-tuning of LLMs on medical instruction data that cannot be centralized for privacy reasons. Avg Score aggregates accuracy over PubMedQA, MedMCQA, MedQA, and CareQA. Metric: Avg Score (higher is better).
| # | Model | Avg Score | Paper |
|---|---|---|---|
| 1 | ZeroOne.AI - Llama3.1-Aloe-Beta-8B | 63.57 | - |
| 2 | FL-finetune-JB-DC - Bio-Medical-Llama-3-8B | 63.12 | - |
| 3 | AI4EOSC Team - Bio-Medical-Llama-3-8B | 62.14 | - |
| 4 | Gachon Cognitive Computing Lab - Bio-Medical-Llama-3-8B | 62.12 | - |
| 5 | mHealth Lab - Bio-Medical-Llama-3-8B | 62.03 | - |
| 6 | ZJUDAI - Llama-3.1-8B-Instruct | 61.75 | - |
| 7 | ZJUDAI - Qwen2.5-7B-Instruct | 60.64 | - |
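
As a rough illustration of how an Avg Score like the one in the table could be computed, the sketch below takes an unweighted mean of per-benchmark accuracy over the four benchmarks named above. The unweighted mean, the two-decimal rounding, the function name `avg_score`, and the example accuracies are all assumptions made for illustration; the leaderboard does not state its exact aggregation rule, and the numbers are not taken from any entry.

```python
from statistics import mean

# Benchmarks named in the track description.
BENCHMARKS = ("PubMedQA", "MedMCQA", "MedQA", "CareQA")


def avg_score(accuracies: dict[str, float]) -> float:
    """Unweighted mean of per-benchmark accuracy (percent), rounded to 2 decimals.

    Assumption: every benchmark carries equal weight.
    """
    missing = [b for b in BENCHMARKS if b not in accuracies]
    if missing:
        raise ValueError(f"missing benchmarks: {missing}")
    return round(mean(accuracies[b] for b in BENCHMARKS), 2)


# Hypothetical per-benchmark accuracies, NOT taken from the leaderboard.
example = {"PubMedQA": 70.0, "MedMCQA": 60.0, "MedQA": 65.0, "CareQA": 55.0}
print(avg_score(example))  # 62.5
```

If the track weights benchmarks by question count instead, the mean would need per-benchmark weights; the sketch does not cover that case.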