NearOptimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals
Abstract
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals. In the former problem, given labeled examples $(\mathbf{x}, y)$ from an unknown distribution on $\mathbb{R}^d \times \{ \pm 1\}$, whose marginal distribution on $\mathbf{x}$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with 01 loss $\mathrm{OPT}+\epsilon$, where $\mathrm{OPT}$ is the 01 loss of the bestfitting halfspace. In the latter problem, given labeled examples $(\mathbf{x}, y)$ from an unknown distribution on $\mathbb{R}^d \times \mathbb{R}$, whose marginal distribution on $\mathbf{x}$ is the standard Gaussian and the labels $y$ can be arbitrary, the goal is to output a hypothesis with square loss $\mathrm{OPT}+\epsilon$, where $\mathrm{OPT}$ is the square loss of the bestfitting ReLU. We prove Statistical Query (SQ) lower bounds of $d^{\mathrm{poly}(1/\epsilon)}$ for both of these problems. Our SQ lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.
 Publication:

arXiv eprints
 Pub Date:
 June 2020
 arXiv:
 arXiv:2006.16200
 Bibcode:
 2020arXiv200616200D
 Keywords:

 Computer Science  Machine Learning;
 Computer Science  Data Structures and Algorithms;
 Mathematics  Statistics Theory;
 Statistics  Machine Learning
 EPrint:
 19 pages