GPT-4 is Slightly Helpful for Peer-Review Assistance: A Pilot Study
Abstract
In this pilot study, we investigate the use of GPT-4 to assist in the peer-review process. Our key hypothesis was that GPT-generated reviews could achieve helpfulness comparable to that of human reviewers. By comparing reviews written by human reviewers with those generated by GPT models for academic papers submitted to a major machine learning conference, we provide initial evidence that artificial intelligence can contribute effectively to the peer-review process. We also perform robustness experiments with inserted errors to understand which parts of a paper the model tends to focus on. Our findings open new avenues for leveraging machine learning tools to address resource constraints in peer review. The results also shed light on potential enhancements to the review process and lay the groundwork for further research on scaling oversight in a domain where human feedback is an increasingly scarce resource.
- Publication: arXiv e-prints
- Pub Date: June 2023
- DOI: 10.48550/arXiv.2307.05492
- arXiv: arXiv:2307.05492
- Bibcode: 2023arXiv230705492R
- Keywords:
  - Computer Science - Human-Computer Interaction
  - Computer Science - Artificial Intelligence
  - Computer Science - Computation and Language
- E-Print: 15 pages