Identifying and Mitigating Bottlenecks in Role-Playing Agents: A Systematic Study of Disentangling Character Profile Axes

Abstract

arXiv:2601.04716v3

While Large Language Model (LLM) role-playing agents have advanced rapidly, it remains unclear which profile elements genuinely drive role-playing quality. To bridge this gap, we introduce a systematic diagnostic framework that disentangles the impact of character profiles along three axes: Familiarity (Known vs. Unknown), Structure (Structured vs. Unstructured), and Disposition (Moral vs. Immoral). Utilizing a unified hierarchical schema (5 dimensions, 28 fields), we construct a controlled dataset of 211 personas and evaluate five LLMs on both single- and multi-turn interactions. Our results reveal a striking asymmetry: Familiarity and Structure show negligible impact, while Disposition produces large, consistent performance degradation for immoral characters across all conditions. Further analyses suggest that the Moral–Immoral gap is amplified by post-SFT alignment, and that this degradation varies substantially across profile attributes. To mitigate this bottleneck, we propose Field-Aware Contrastive Decoding (FACD), a training-free strategy that amplifies suppressed disposition-sensitive signals, significantly closing the performance gap without sacrificing moral-character performance.
