Zero-shot Factual Consistency Evaluation Across Domains

Agarwal, Raunak

This work addresses the challenge of factual consistency in text generation systems. We unify the tasks of Natural Language Inference, Summarization Evaluation, Factuality Verification, and Factual Consistency Evaluation to train models that evaluate the factual consistency of source-target pairs across diverse domains. We rigorously evaluate these models against eight baselines on a comprehensive benchmark suite of 22 datasets that span various tasks, domains, and document lengths. Results demonstrate that our method achieves state-of-the-art performance on this heterogeneous benchmark while addressing efficiency concerns and attaining cross-domain generalization.
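The abstract does not say how the four tasks are unified. As a rough illustration only, one way to picture it is to map each task's examples onto a shared source-target record with a binary consistency label. Everything in the sketch below is an assumption made for illustration and is not taken from the paper: the `ConsistencyExample` class, the converter function names, the label strings ("entailment", "SUPPORTS"), and the choice to treat every non-entailment NLI label as inconsistent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyExample:
    """Hypothetical shared record: is `target` factually supported by `source`?"""
    source: str
    target: str
    consistent: bool
    task: str  # which task the example was drawn from


def from_nli(premise: str, hypothesis: str, label: str) -> ConsistencyExample:
    # Assumption: only "entailment" counts as consistent;
    # "neutral" and "contradiction" are both mapped to inconsistent.
    return ConsistencyExample(premise, hypothesis, label == "entailment", "nli")


def from_summarization(document: str, summary: str, is_faithful: bool) -> ConsistencyExample:
    # Assumption: the dataset already provides a binary faithfulness judgment.
    return ConsistencyExample(document, summary, is_faithful, "summarization")


def from_fact_verification(evidence: str, claim: str, verdict: str) -> ConsistencyExample:
    # Assumption: a "SUPPORTS" verdict is the only consistent outcome.
    return ConsistencyExample(evidence, claim, verdict == "SUPPORTS", "fact_verification")


examples = [
    from_nli("A man is playing a guitar.", "A person is making music.", "entailment"),
    from_nli("A man is playing a guitar.", "The man is asleep.", "contradiction"),
    from_summarization("The meeting was moved to Friday.", "The meeting is on Monday.", False),
    from_fact_verification("Paris is the capital of France.", "Paris is in France.", "SUPPORTS"),
]

for ex in examples:
    print(f"{ex.task}: consistent={ex.consistent}")
```

With records in one shape, a single binary classifier over (source, target) pairs could in principle be trained on the pooled data and applied to unseen domains; whether the paper follows this exact scheme is not stated in the abstract.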

Publication:

arXiv e-prints

Pub Date:

August 2024

DOI:

10.48550/arXiv.2408.04114

arXiv:

arXiv:2408.04114

Bibcode:

2024arXiv240804114A

Keywords:

Computer Science - Computation and Language;
Computer Science - Machine Learning
