Abstract
Medical question answering systems are increasingly used in healthcare and clinical decision support, yet many existing approaches struggle with complex, multi-step medical reasoning. This paper presents MAPC, a Multi-Agent Planning-Collaboration framework that handles each question in three stages, each run by a dedicated agent: a Planning Agent that generates targeted sub-questions, a Role-Playing Agent that answers them using evidence-based reasoning, and a Polishing Agent that synthesizes a final decision. Built on the Qwen3-30B-A3B-Instruct model, MAPC achieves 76.0% on MedQA, 79.0% on MedMCQA, and 92.0% on PubMedQA, surpassing strong baselines including GPT-4 and Med-PaLM 2, with a gain of 10.2 percentage points over the prior best result on PubMedQA. These results show that structured multi-agent planning and JSON-constrained prompting can enhance robustness and reduce hallucinations in medical QA, highlighting multi-agent collaboration as a promising paradigm for more accurate and interpretable medical AI systems.
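The three-stage flow described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names, prompt wording, stage tags, and JSON keys (`sub_questions`, `answer`, `final_answer`) are all hypothetical, and the model is abstracted as any callable that maps a prompt string to a raw text reply.

```python
import json
from typing import Callable, Dict, List

# Hypothetical interface: any function that takes a prompt and returns raw text.
LLM = Callable[[str], str]


def plan(llm: LLM, question: str, options: Dict[str, str]) -> List[str]:
    """Planning Agent: ask for targeted sub-questions as JSON."""
    prompt = (
        "[PLAN] Break the medical question into sub-questions. "
        'Reply with a JSON object with key "sub_questions" (list of strings).\n'
        f"Question: {question}\nOptions: {json.dumps(options)}"
    )
    return json.loads(llm(prompt))["sub_questions"]


def role_play(llm: LLM, question: str, sub_questions: List[str]) -> List[Dict[str, str]]:
    """Role-Playing Agent: answer each sub-question in turn."""
    findings = []
    for sub_question in sub_questions:
        prompt = (
            "[ANSWER] Acting as a clinician, answer using evidence-based reasoning. "
            'Reply with a JSON object with key "answer" (string).\n'
            f"Question: {question}\nSub-question: {sub_question}"
        )
        reply = json.loads(llm(prompt))
        findings.append({"sub_question": sub_question, "answer": reply["answer"]})
    return findings


def polish(llm: LLM, question: str, options: Dict[str, str],
           findings: List[Dict[str, str]]) -> str:
    """Polishing Agent: synthesize the findings into one option label."""
    prompt = (
        "[POLISH] Choose the best option given the findings. "
        'Reply with a JSON object with key "final_answer" (an option label).\n'
        f"Question: {question}\nOptions: {json.dumps(options)}\n"
        f"Findings: {json.dumps(findings)}"
    )
    choice = json.loads(llm(prompt))["final_answer"]
    # Reject replies that fall outside the allowed option labels.
    if choice not in options:
        raise ValueError(f"unexpected answer label: {choice!r}")
    return choice


def mapc_answer(llm: LLM, question: str, options: Dict[str, str]) -> str:
    """Run the three stages in sequence: plan, answer, polish."""
    sub_questions = plan(llm, question, options)
    findings = role_play(llm, question, sub_questions)
    return polish(llm, question, options, findings)
```

Parsing every reply with `json.loads` and checking the final label against the option set is one simple way to realize JSON-constrained prompting: a malformed or out-of-range reply raises an error rather than silently propagating to the final decision.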