Statistical theory for image classification using deep convolutional neural networks with cross-entropy loss
Abstract
Convolutional neural networks learned by minimizing the cross-entropy loss are nowadays the standard for image classification. Until now, the statistical theory behind these networks has been lacking. We analyze the rate of convergence of the misclassification risk of the estimates towards the optimal misclassification risk. Under suitable assumptions on the smoothness and structure of the a posteriori probability, it is shown that these estimates achieve a rate of convergence which is independent of the dimension of the image. The study sheds light on the good performance of CNNs learned by cross-entropy loss and partly explains their success in practical applications.
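The quantities named in the abstract can be written out in standard binary-classification notation. This is only a sketch: the symbols $\eta$, $\eta_n$, $f_n$, $f^*$, $\hat{L}_n$ and $\mathcal{D}_n$, and the restriction to labels $Y \in \{0,1\}$, are assumptions made here for illustration and are not taken from the abstract. Here $\eta$ is the a posteriori probability, $\eta_n$ an estimate of it computed from the sample $\mathcal{D}_n = \{(X_1,Y_1),\dots,(X_n,Y_n)\}$ (for instance, the output of a network fitted by minimizing the empirical cross-entropy $\hat{L}_n$), and $f_n$ the resulting plug-in classifier.

```latex
\begin{align*}
\eta(x) &= \mathbf{P}\{Y = 1 \mid X = x\}, \\
f^*(x) &= \mathbf{1}\{\eta(x) \ge 1/2\}, \qquad
\mathbf{P}\{f^*(X) \ne Y\} = \min_{f} \mathbf{P}\{f(X) \ne Y\}, \\
f_n(x) &= \mathbf{1}\{\eta_n(x) \ge 1/2\}, \\
\hat{L}_n(\eta_n) &= -\frac{1}{n} \sum_{i=1}^{n}
  \bigl[ Y_i \log \eta_n(X_i) + (1 - Y_i) \log\bigl(1 - \eta_n(X_i)\bigr) \bigr], \\
\text{excess risk} &= \mathbf{P}\{f_n(X) \ne Y \mid \mathcal{D}_n\}
  - \mathbf{P}\{f^*(X) \ne Y\} \;\ge\; 0.
\end{align*}
```

In this notation, a rate of convergence of the misclassification risk describes how fast the (expected) excess risk tends to zero as the sample size $n$ grows.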
 Publication:

arXiv e-prints
 Pub Date:
 November 2020
 arXiv:
 arXiv:2011.13602
 Bibcode:
 2020arXiv201113602K
 Keywords:

 Mathematics - Statistics Theory
 E-Print:
 arXiv admin note: text overlap with arXiv:2003.01526