Generalization error bounds for learning to rank: Does the length of document lists matter?
Abstract
We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking. Existing generalization error bounds necessarily degrade as the size of the document list associated with a query increases. We show that such a degradation is not intrinsic to the problem. For several loss functions, including the cross-entropy loss used in the well-known ListNet method, there is no degradation in generalization ability as document lists become longer. We also provide novel generalization error bounds under $\ell_1$ regularization and faster convergence rates if the loss function is smooth.
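For concreteness, the cross-entropy loss referred to in the abstract can be sketched as below. This is a minimal illustration that assumes the common "top-one" ListNet formulation, in which both the relevance labels and the predicted scores for one query's document list are turned into probability distributions by a softmax. The function names `log_softmax` and `listnet_cross_entropy` are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(values):
    """Return log-softmax of a list of floats, computed stably."""
    m = max(values)
    log_norm = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_norm for v in values]


def listnet_cross_entropy(scores, relevance):
    """Cross-entropy between softmax(relevance) and softmax(scores).

    Assumption: top-one ListNet-style loss for a single query; `scores`
    and `relevance` hold one entry per document in the list.
    """
    if len(scores) != len(relevance):
        raise ValueError("scores and relevance must have the same length")
    target = [math.exp(lp) for lp in log_softmax(relevance)]
    log_pred = log_softmax(scores)
    return -sum(t * lp for t, lp in zip(target, log_pred))


# Scores that order the documents like the labels give a lower loss
# than scores that reverse the order.
aligned = listnet_cross_entropy([3.0, 2.0, 1.0], [3.0, 2.0, 1.0])
reversed_order = listnet_cross_entropy([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The loss is defined for a list of any length, which is the setting the abstract's question about document-list length concerns. The sketch only shows how the loss is evaluated and does not reproduce any bound from the paper.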
- Publication: arXiv e-prints
- Pub Date: March 2016
- DOI: 10.48550/arXiv.1603.01860
- arXiv: arXiv:1603.01860
- Bibcode: 2016arXiv160301860T
- Keywords: Computer Science - Machine Learning
- E-Print: Appeared in ICML 2015. arXiv admin note: substantial text overlap with arXiv:1405.0586