Generalization error bounds for learning to rank: Does the length of document lists matter?

doi:10.48550/arXiv.1603.01860

We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking. Existing generalization error bounds necessarily degrade as the size of the document list associated with a query increases. We show that such a degradation is not intrinsic to the problem. For several loss functions, including the cross-entropy loss used in the well-known ListNet method, there is no degradation in generalization ability as document lists become longer. We also provide novel generalization error bounds under $\ell_1$ regularization and faster convergence rates if the loss function is smooth.

Publication:

arXiv e-prints

Pub Date:

March 2016

DOI:

10.48550/arXiv.1603.01860

arXiv:

arXiv:1603.01860

Bibcode:

2016arXiv160301860T

Keywords:

Computer Science - Machine Learning

E-Print:

Appeared in ICML 2015. arXiv admin note: substantial text overlap with arXiv:1405.0586
