Towards Lower Bounds on the Depth of ReLU Neural Networks

doi:10.48550/arXiv.2105.14835

We contribute to a better understanding of the class of functions that can be represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems which suggest that a single hidden layer is sufficient for learning any function. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). As a by-product of our investigations, we settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative. We also present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
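As a small illustration of what "exactly representable" means here, the sketch below uses the standard identity max(x, y) = ReLU(x - y) + ReLU(y) - ReLU(-y). It shows that the maximum of two numbers is computed exactly by one hidden ReLU layer, and that composing it gives the maximum of four numbers with two hidden layers. The function names are hypothetical and the code is not taken from the paper. It only shows that these depths suffice, not whether fewer layers would.

```python
def relu(t):
    """Rectified linear unit: max(t, 0)."""
    return max(t, 0.0)


def max2_one_hidden_layer(x, y):
    """Compute max(x, y) exactly with one hidden layer of three ReLU neurons.

    Hypothetical helper for illustration only.
    """
    # Hidden layer: ReLU applied to three linear functions of the input.
    h1 = relu(x - y)
    h2 = relu(y)
    h3 = relu(-y)
    # Output layer: a linear combination with no activation.
    # h2 - h3 equals y, so the result is ReLU(x - y) + y = max(x, y).
    return h1 + h2 - h3


def max4_two_hidden_layers(a, b, c, d):
    """Compute max(a, b, c, d) exactly with two hidden ReLU layers.

    The first hidden layer yields the two pairwise maxima (as linear
    combinations of its neurons); the second hidden layer takes the
    maximum of those two values.
    """
    m1 = max2_one_hidden_layer(a, b)
    m2 = max2_one_hidden_layer(c, d)
    return max2_one_hidden_layer(m1, m2)
```

For example, `max4_two_hidden_layers(1, 7, -2, 4)` agrees with Python's built-in `max(1, 7, -2, 4)`. With non-integer floating-point inputs the two may differ by rounding error, since the network evaluates sums and differences rather than comparisons.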

Publication:

arXiv e-prints

Pub Date:

May 2021

DOI:

10.48550/arXiv.2105.14835

arXiv:

arXiv:2105.14835

Bibcode:

2021arXiv210514835H

Keywords:

Computer Science - Machine Learning;
Computer Science - Discrete Mathematics;
Computer Science - Neural and Evolutionary Computing;
Mathematics - Combinatorics;
Statistics - Machine Learning

E-Print:

Authors' accepted manuscript for SIAM Journal on Discrete Mathematics. A preliminary conference version appeared at NeurIPS 2021.
