Stably unactivated neurons in ReLU neural networks

doi:10.48550/arXiv.2412.06829

The architecture of a neural network determines which functions that network can realize, so the expressiveness of a chosen architecture has received much attention. In ReLU neural networks, the presence of stably unactivated neurons can reduce the network's expressiveness. In this work, we investigate the probability that a neuron in the second hidden layer of such a network is stably unactivated when the weights and biases are initialized from symmetric probability distributions. For networks with input dimension $n_0$, we prove that if the first hidden layer has $n_0+1$ neurons, then this probability is exactly $\frac{2^{n_0}+1}{4^{n_0+1}}$, and if the first hidden layer has $n_1$ neurons with $n_1 \le n_0$, then the probability is $\frac{1}{2^{n_1+1}}$. Finally, for the case in which the first hidden layer has more than $n_0+1$ neurons, we propose a conjecture, explain its rationale, and present computational evidence supporting it.
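The two closed forms above can be checked numerically in the simplest setting, $n_0 = 1$. The sketch below is illustrative and is not taken from the paper. It assumes that "stably unactivated" means the neuron's pre-activation is non-positive for every real input, that the input domain is all of $\mathbb{R}$, and that all weights and biases are independent standard normal draws (one possible symmetric distribution). The names `closed_form` and `estimate_1d` are hypothetical helpers introduced here.

```python
from fractions import Fraction

import numpy as np


def closed_form(n0: int, n1: int) -> Fraction:
    """Probability stated in the abstract for input dimension n0, first-layer width n1."""
    if n1 <= n0:
        return Fraction(1, 2 ** (n1 + 1))
    if n1 == n0 + 1:
        return Fraction(2 ** n0 + 1, 4 ** (n0 + 1))
    # The abstract only offers a conjecture for wider first layers.
    raise ValueError("n1 > n0 + 1 is not covered by the proven formulas")


def estimate_1d(n1: int, trials: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate for n0 = 1 (assumed setup: standard normal parameters).

    The second-layer pre-activation z(x) = b + sum_i w_i * relu(a_i * x + c_i)
    is piecewise linear in x, so z <= 0 everywhere iff it does not grow
    towards +inf or -inf and it is <= 0 at every kink x_i = -c_i / a_i.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((trials, n1))  # first-layer weights
    c = rng.standard_normal((trials, n1))  # first-layer biases
    w = rng.standard_normal((trials, n1))  # second-layer weights of one neuron
    b = rng.standard_normal(trials)        # second-layer bias of that neuron

    # Slope of z as x -> +inf (units with a_i > 0 are active) and as x -> -inf.
    slope_right = np.where(a > 0, w * a, 0.0).sum(axis=1)
    slope_left = np.where(a < 0, w * a, 0.0).sum(axis=1)
    bounded_above = (slope_right <= 0) & (slope_left >= 0)

    # Evaluate z at every kink: hidden[t, k, i] = relu(a_i * x_k + c_i).
    kinks = -c / a
    hidden = np.maximum(a[:, None, :] * kinks[:, :, None] + c[:, None, :], 0.0)
    z_at_kinks = b[:, None] + (w[:, None, :] * hidden).sum(axis=2)

    stable = bounded_above & (z_at_kinks.max(axis=1) <= 0)
    return float(stable.mean())


if __name__ == "__main__":
    for n1 in (1, 2):
        exact = closed_form(1, n1)
        print(f"n1={n1}: closed form {exact} = {float(exact):.4f}, "
              f"estimate {estimate_1d(n1):.4f}")
```

Under these assumptions, the estimate for `n1 = 1` can be compared with $\frac{1}{2^{n_1+1}}$ and the estimate for `n1 = 2` with $\frac{2^{n_0}+1}{4^{n_0+1}}$. The kink-based check works only for scalar inputs; higher input dimensions would need a different test for boundedness.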

Publication:

arXiv e-prints

Pub Date:

December 2024

DOI:

10.48550/arXiv.2412.06829

arXiv:

arXiv:2412.06829

Bibcode:

2024arXiv241206829B

Keywords:

Computer Science - Machine Learning;
Mathematics - Probability;
Statistics - Machine Learning
