Abstract
arXiv:2603.00857v2 Announce Type: replace Abstract: MultiPUFFIN is a domain-informed multimodal foundation model for predicting thermophysical properties of small molecules, addressing a critical gap in chemical engineering, drug discovery, and materials science. Existing molecular foundation models pretrain on millions of molecules to learn general-purpose representations, but their standard MLP output layers impose no physical constraints, vapor pressure predictions may violate monotonic temperature dependence, and viscosity curves may lack the functional form required by process simulators. Domain-informed approaches that guarantee thermodynamic consistency have remained limited to single properties and small datasets, whereas multimodal foundation models have focused on biological activity rather than thermophysical properties. MultiPUFFIN fills this gap by fusing SMILES sequences, 2D molecular graphs, and 3D conformer geometries through bidirectional cross-modal attention and gated fusion, supplemented by auxiliary encoders for experimental conditions and molecular descriptors. The backbone is pretrained on 500,000 unlabelled PubChem molecules using three complementary self-supervised objectives. A condition-aware refinement stack of five conditioners (temperature, pH, pressure, polymorph, and measurement method) routes each property to a four-head tournament that selects the best-performing thermodynamically informed head for that property. MultiPUFFIN achieves a mean test R2 of 0.784 and outperforms fine-tuned ChemBERTa-2 on all nine properties despite training on roughly 2,000x fewer labeled molecules.