The Strongest Teacher Is Not Always the Best Teacher: Student-Centric Answer Selection

Zhengyu Hu·Zheyuan Xiao·Linxin Song·Fengqing Jiang·Yutai Li·Zhengyu Chen·Zhihan Xiong·Yue Liu·Junhao Lin·Yao Su·Lijie Hu·Kaize Ding·Xiao Teng·Radha Poovendran·2026


cs.LG cs.AI cs.CL

Abstract

arXiv:2605.26872v1

LLM training increasingly relies on teacher-generated supervision, from synthetic responses to reasoning traces and tool-use demonstrations. Current practice often chooses the highest-performing teacher to generate student training data, implicitly treating teacher test performance as a proxy for teaching quality. We show that this assumption can fail: even when multiple teachers provide correct answers to the same question, the answer from the strongest teacher is not necessarily the best supervision for a given student. To address this gap, we propose Student-Centric Answer Sampling (SCAS), a framework that selects from verified teacher-generated answers according to their estimated student-centric learning cost. Motivated by a token-wise gradient decomposition, we derive an efficient forward-only proxy for this cost and use it to guide answer selection during training. Experiments across 30 teacher models, 6 student base models, and 8 tasks show that SCAS consistently improves student performance, suggesting that effective distillation should prioritize supervision matched to the current student rather than teacher strength alone.
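The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the forward-only proxy, so this sketch substitutes an assumed stand-in: the mean per-token negative log-likelihood of each answer under the current student, which also needs only a forward pass. The names `Candidate`, `mean_nll`, and `select_answer`, and all numbers in the example, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Candidate:
    """One teacher-generated answer to a question (hypothetical container)."""
    teacher: str
    answer: str
    verified: bool  # True if the answer passed a correctness check
    # Log-probabilities the *student* assigns to each answer token,
    # obtained from a single forward pass (no gradients needed).
    student_token_logprobs: Sequence[float]


def mean_nll(token_logprobs: Sequence[float]) -> float:
    """Mean negative log-likelihood per token.

    ASSUMPTION: used here as a stand-in for the paper's learning-cost proxy,
    which the abstract does not specify.
    """
    if not token_logprobs:
        raise ValueError("answer has no tokens")
    return -sum(token_logprobs) / len(token_logprobs)


def select_answer(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the verified answer with the lowest assumed cost, or None."""
    verified = [c for c in candidates if c.verified and c.student_token_logprobs]
    if not verified:
        return None
    return min(verified, key=lambda c: mean_nll(c.student_token_logprobs))


# Illustrative values only: three teachers answer the same question.
candidates = [
    Candidate("strong-teacher", "answer A", True, [-2.0, -3.0, -2.5]),
    Candidate("mid-teacher", "answer B", True, [-0.5, -1.0, -0.75]),
    Candidate("weak-teacher", "answer C", False, [-0.1]),  # failed verification
]
best = select_answer(candidates)
print(best.teacher if best else "no verified answer")
```

In this toy setting the unverified answer is excluded regardless of its cost, and the choice among the remaining answers depends on the student's own token probabilities rather than on which teacher scores highest on a benchmark.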
