← all papers Β· overview

SMolLM: Small Language Models Learn Small Molecular Grammar

Abstract

arXiv:2605.06322v2 Announce Type: replace Abstract: Language models for molecular design have scaled to hundreds of millions of parameters, yet how they learn chemical grammar is poorly understood. We train SMolLM, a 53K-parameter weight-shared transformer, to generate novel SMILES with 95% validity on the ZINC-250K drug-like-molecule benchmark, outperforming a standard GPT with 10 times more parameters. Mechanistically, the same block resolves SMILES constraints across passes in a fixed hierarchy: brackets first, rings second, and valence last, as shown by error classification and linear probing, with ablation isolating the bracket-matching head. Together, these results yield a compact, mechanistically interpretable molecular generator and a testbed for studying iterative computation in formal-language domains.

Related papers