TorchLean: Formalizing Neural Networks in Lean

Abstract

arXiv:2602.22631v2 Announce Type: replace-cross Abstract: Neural networks are increasingly deployed in scientific, safety critical, and mission critical pipelines, yet verification and analysis are often performed outside the programming environment that defines and runs the model. This creates a semantic gap between the executed network and the analyzed artifact: guarantees can depend on implicit conventions about operator semantics, tensor layouts, preprocessing, floating-point behavior, graph transformations, accelerated kernels, and external certificates. We present TorchLean, a unified framework for formalizing, executing, and verifying neural networks in Lean 4. TorchLean treats learned models as executable programs and mathematical objects with a shared semantics for computation, verification, and theorem proving. The framework provides a PyTorch style API for typed tensors, layers, objectives, optimizers, automatic differentiation, and graph programs, with eager and compiled execution paths that lower to a common computation-graph representation. TorchLean supports exact and finite-precision tensor semantics, verified reverse-mode differentiation, interval and affine bound propagation, CROWN/LiRPA style certificate checking, import/export workflows, and CUDA-backed execution through explicit FFI boundaries. It also includes semantic layers for attention and FlashAttention, state-space sequence models, diffusion and sampling processes, probability kernels, reinforcement-learning objectives and Markov decision processes, and self-supervised objectives such as masked autoencoding, JEPA-style predictive views, and variance/correlation-based anti-collapse losses. Together, these components provide a semantic foundation for verified machine learning, where executable neural network artifacts, verification procedures, runtime boundaries, and mathematical claims can be stated and related inside one theorem-proving environment.

Abstract

Related papers