On minimal representations of shallow ReLU networks

doi:10.48550/arXiv.2108.05643

The realization function of a shallow ReLU network is a continuous and piecewise affine function $f:\mathbb R^d\to \mathbb R$, where the domain $\mathbb R^{d}$ is partitioned by a set of $n$ hyperplanes into cells on which $f$ is affine. We show that the minimal representation for $f$ uses either $n$, $n+1$ or $n+2$ neurons, and we characterize each of the three cases. In the particular case where the input layer is one-dimensional, minimal representations always use at most $n+1$ neurons, but in all higher-dimensional settings there are functions for which $n+2$ neurons are needed. We then show that the set of minimal networks representing $f$ forms a $C^\infty$-submanifold $M$, and we derive the dimension and the number of connected components of $M$. Additionally, we give a criterion for the hyperplanes that guarantees that all continuous, piecewise affine functions are realization functions of appropriate ReLU networks.
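As a small illustration of the objects in the abstract, the sketch below evaluates a shallow ReLU network in Python with NumPy. It assumes the common parametrization $f(x) = c_0 + \sum_j c_j \max(0, w_j \cdot x + b_j)$, which may differ in detail from the paper's definition; the function name `shallow_relu` and its parameters are hypothetical and not taken from the paper. The example represents $|x|$ on $\mathbb R$, which has a single breakpoint ($n = 1$) and is realized here with two neurons.

```python
import numpy as np

def shallow_relu(x, W, b, c, c0=0.0):
    """Evaluate c0 + sum_j c[j] * max(0, W[j] @ x + b[j]) for each row of x.

    Assumed parametrization (not taken from the paper):
    W has shape (num_neurons, d), b and c have shape (num_neurons,).
    """
    x = np.atleast_2d(x)                 # shape (num_points, d)
    pre = x @ W.T + b                    # pre-activations, (num_points, num_neurons)
    return np.maximum(pre, 0.0) @ c + c0

# |x| = max(0, x) + max(0, -x): one breakpoint at 0, two neurons.
W = np.array([[1.0], [-1.0]])
b = np.array([0.0, 0.0])
c = np.array([1.0, 1.0])

xs = np.linspace(-2.0, 2.0, 9).reshape(-1, 1)
values = shallow_relu(xs, W, b, c)
```

Under this parametrization a single neuron yields a function that is constant on one side of its breakpoint, so $|x|$ cannot be written with one neuron; this is consistent with the $n+1$ case described in the abstract for one-dimensional input.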

Publication:

arXiv e-prints

Pub Date:

August 2021

DOI:

10.48550/arXiv.2108.05643

arXiv:

arXiv:2108.05643

Bibcode:

2021arXiv210805643D

Keywords:

Computer Science - Machine Learning;
Mathematics - Statistics Theory;
Primary 68T05;
Secondary 68T07;
26B40

E-Print:

16 pages

NASA/ADS
