Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model

Abstract

Empirical studies have revealed low dimensional structures in the eigenspectra of weights, Hessians, gradients, and feature vectors of deep networks, consistently observed across datasets and architectures in the overparameterized regime. In this work, we analyze deep unconstrained feature models (UFMs) to provide an analytic explanation of how these structures emerge at the layerwise level, including the bulk outlier Hessian spectrum and the alignment of gradient descent with the outlier eigenspace. We show that deep neural collapse underlies these phenomena, deriving explicit expressions for eigenvalues and eigenvectors of many deep learning matrices in terms of class feature means. Furthermore, we demonstrate that the full Hessian inherits its low dimensional structure from the layerwise Hessians, and empirically validate our theory in both UFMs and deep networks.

Abstract

Related papers