A key open question in quantum computation is what advantages quantum neural networks (QNNs) may have over classical neural networks (NNs), and in what situations these advantages arise. Here we address this question by studying the memory capacity $C$ of QNNs, a measure of expressive power that we adapt from classical NN theory. We present a capacity inequality showing that the capacity of a QNN is bounded by the information $W$ that can be trained into its parameters: $C \leq W$. One consequence of this bound is that QNNs that are parameterized classically do not show an advantage in capacity over classical NNs with an equal number of parameters. However, QNNs that are parameterized with quantum states could have exponentially larger capacities. We illustrate our theoretical results with numerical experiments, simulating a particular QNN based on a Gaussian Boson Sampler. We also study the influence of sampling, which arises from wavefunction collapse during operation of the QNN, and provide an analytical expression connecting the capacity to the number of times the quantum system is measured.