Feature Resemblance: Towards a Theoretical Understanding of Analogical Reasoning in Transformers

Abstract

arXiv:2603.05143v3 Announce Type: replace-cross Abstract: Understanding reasoning in large language models is complicated by evaluations that conflate multiple reasoning types. We isolate analogical reasoning, where a model transfers an attribute between entities that share known properties, and study when such transfer can emerge from training. To make the problem analytically tractable, we study a minimal transformer-style abstraction that isolates how learned representations support analogical reasoning. Within this setting, we prove three key results. First, joint training on similarity and attribution premises enables analogical reasoning through aligned representations. Second, sequential training succeeds only when similarity structure is learned before specific attributes, revealing a curriculum asymmetry. Third, in our stylized setting, two-hop reasoning $(a \to b, b \to c \Rightarrow a \to c)$ can be viewed as analogical reasoning with identity bridges $(b=b)$, which appear explicitly in training data. Together, these results reveal a unified mechanism: entities with shared properties become aligned in representation space, enabling property transfer through feature resemblance. Experiments with architectures up to 8B parameters show qualitative agreement with the theory and suggest that representational geometry plays an important role in analogical reasoning beyond the stylized model.

Abstract

Related papers