Following Length Constraints in Instructions

Abstract:

Aligned instruction-following models can better fulfill user requests than their unaligned counterparts. However, it has been shown that there is a length bias in the evaluation of such models, and that training algorithms tend to exploit this bias by learning to produce longer responses. In this work, we show how to train models that can be controlled at inference time with instructions containing desired length constraints. Such models are superior in length-instructed evaluations, outperforming standard instruction-following models such as GPT4, Llama 3, and Mixtral.

Publication:

arXiv e-prints

Pub Date:

June 2024

DOI:

10.48550/arXiv.2406.17744

arXiv:

arXiv:2406.17744

Bibcode:

2024arXiv240617744Y

Keywords:

Computer Science - Computation and Language

E-Print:

13 pages
