Proper Agnostic Learning of Functions of Halfspaces under Gaussian Marginals

Abstract

arXiv:2605.27594v1 Announce Type: cross Abstract: We study the problem of computationally efficient proper agnostic learning of multidimensional concept classes under the Gaussian distribution. In this setting, given i.i.d. labeled samples from an unknown distribution over $\mathbb{R}^d \times \{\pm 1\}$ whose marginal on $\mathbb{R}^d$ is Gaussian, the goal is to output a hypothesis from a target class $\mathcal{F}$ whose 0-1 loss is within $\epsilon$ of that of the best classifier in $\mathcal{F}$. We give the first efficient proper agnostic learning algorithm for arbitrary Boolean functions of $K$ halfspaces under Gaussian marginals. Our algorithm runs in time $d^{O(K^2 \log(1/\epsilon)/\epsilon^2)} + (K/\epsilon)^{O(K^3/\epsilon^{2.5})}$. Prior to our work, the only known algorithm for $K \geq 2$ was brute-force search, with run-time exponential in $d$. Moreover, the dependence of our run-time on the dimension $d$ matches that of the best known improper learning algorithm, namely $d^{\widetilde{O}(K^2/\epsilon^2)}$. For the special case of a single halfspace ($K=1$), the best previous run-time was $d^{O(1/\epsilon^4)} + (1/\epsilon)^{O(1/\epsilon^6)}$. Our algorithm improves this to $d^{O(1/\epsilon^2)} + (1/\epsilon)^{O(1/\epsilon^{2.5})}$. Once again, the dependence on $d$ matches that of the best known improper algorithm, namely $d^{O(1/\epsilon^2)}$. Furthermore, the dependence of our run-time on the dimension $d$ is essentially optimal in the statistical query model.

Abstract

Related papers