Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers

doi:10.48550/arXiv.2404.11960

The most recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results. However, these rankers are hindered by two major drawbacks: (1) they fail to follow standardized comparison guidance during the ranking process, and (2) they struggle to consider complicated passages comprehensively. To address these shortcomings, we propose to build a ranker that generates ranking scores based on a set of criteria from various perspectives. These criteria are intended to direct each perspective in providing a distinct yet synergistic evaluation. Our research, which examines eight datasets from the BEIR benchmark, demonstrates that incorporating this multi-perspective criteria ensemble approach markedly enhances the performance of pointwise LLM rankers.
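
The idea of scoring each passage under several criteria and combining the results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how criteria are generated or how per-criterion scores are combined, so the example criteria strings, the averaging step, and the names `toy_scorer`, `ensemble_score`, and `rank` are all assumptions made for illustration. A keyword-overlap function stands in for the LLM call.

```python
from statistics import mean
from typing import Callable, List, Tuple

# Hypothetical scorer signature: (query, passage, criterion) -> score in [0, 1].
# In a real system this would wrap a pointwise LLM prompt.
Scorer = Callable[[str, str, str], float]


def toy_scorer(query: str, passage: str, criterion: str) -> float:
    """Stand-in for an LLM call: share of query and criterion terms present in the passage."""
    terms = set((query + " " + criterion).lower().split())
    words = set(passage.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def ensemble_score(query: str, passage: str, criteria: List[str], scorer: Scorer) -> float:
    """Score one passage once per criterion, then average (averaging is an assumption)."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    return mean(scorer(query, passage, c) for c in criteria)


def rank(
    query: str,
    passages: List[str],
    criteria: List[str],
    scorer: Scorer = toy_scorer,
) -> List[Tuple[str, float]]:
    """Pointwise ranking: score every passage independently, then sort by score."""
    scored = [(p, ensemble_score(query, p, criteria, scorer)) for p in passages]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    query = "how do solar panels work"
    criteria = ["topical relevance", "factual detail"]  # hypothetical criteria
    passages = [
        "cats sleep most of the day",
        "solar panels convert sunlight into electricity",
    ]
    for passage, score in rank(query, passages, criteria):
        print(f"{score:.3f}  {passage}")
```

Because each passage is scored independently of the others, the scorer can be swapped for any function with the same signature without changing the ranking loop.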

Publication:

arXiv e-prints

Pub Date:

April 2024

DOI:

10.48550/arXiv.2404.11960

arXiv:

arXiv:2404.11960

Bibcode:

2024arXiv240411960G

Keywords:

Computer Science - Information Retrieval;
Computer Science - Artificial Intelligence
