Probing Social Identity Bias in Chinese LLMs with Gendered Pronouns and Social Groups

Abstract

arXiv:2510.06974v2

Large language models (LLMs) are increasingly deployed in user-facing applications, raising concerns that they may reflect and amplify social biases. We investigate social identity biases in Chinese LLMs using Mandarin-specific prompts across ten representative models. Our evaluation compares ingroup ("We") and outgroup ("They") framings across 240 social groups salient in the Chinese context, using a two-tiered measurement framework that assesses both sentiment and toxicity. The prompt design explicitly accounts for linguistic properties of Mandarin, including the distinction between the default gender-neutral plural pronoun and its explicitly feminine counterpart, enabling a controlled comparison of social identity framing effects. Across models, we observe systematic ingroup-outgroup asymmetries, although their expression differs across measurement dimensions. In particular, instruction tuning often reduces sentiment asymmetries, while toxicity gaps remain more persistent. Moreover, the feminine-marked plural pronoun is associated with higher toxicity than the default gender-neutral plural in several models. Our study introduces a language-aware evaluation framework for Chinese LLMs and shows that (i) social identity biases previously documented in English also manifest in Chinese and that (ii) Mandarin-specific linguistic structure can reveal bias patterns that are not directly observable in English-only settings.
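As a rough illustration of the framing comparison described above, the following Python sketch builds one prompt stem per framing for a social group and computes a simple mean-score gap between two framings. It is not the authors' implementation. The pronoun strings (我们, 他们, 她们), the prompt template, and the names `PRONOUNS`, `build_prompts`, and `asymmetry` are all assumptions made for illustration; the abstract specifies neither the exact prompts nor the sentiment and toxicity scorers, so scores are passed in as plain lists of floats.

```python
from statistics import mean

# Assumed pronoun forms; the abstract does not list the exact strings used.
PRONOUNS = {
    "ingroup": "我们",            # "we"
    "outgroup_default": "他们",   # "they", default plural
    "outgroup_feminine": "她们",  # "they", feminine-marked plural
}


def build_prompts(group: str) -> dict[str, str]:
    """Return one sentence-completion stem per framing for a social group.

    Hypothetical template: "<pronoun><group>是", roughly "We/They <group> are".
    """
    return {framing: f"{pronoun}{group}是" for framing, pronoun in PRONOUNS.items()}


def asymmetry(scores_a: list[float], scores_b: list[float]) -> float:
    """Mean score under framing A minus mean score under framing B.

    The scores are assumed to come from an external sentiment or toxicity
    classifier applied to model completions; no classifier is called here.
    """
    return mean(scores_a) - mean(scores_b)


if __name__ == "__main__":
    prompts = build_prompts("学生")  # "students", an illustrative group
    for framing, stem in prompts.items():
        print(framing, stem)

    # Made-up toxicity scores, used only to exercise the function.
    feminine = [0.30, 0.50]
    default = [0.20, 0.40]
    print(asymmetry(feminine, default))
```

Under these assumptions, a positive gap between the feminine-marked and default outgroup framings on toxicity scores would correspond to the pattern the abstract reports for several models, and the same function applies to the ingroup versus outgroup comparison on either measurement dimension.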
