Entailment as Robust Self-Learner
Abstract
Entailment has been recognized as an important metric for evaluating natural language understanding (NLU) models, and recent studies have found that entailment pretraining benefits weakly supervised fine-tuning. In this work, we design a prompting strategy that formulates a number of different NLU tasks as contextual entailment. This approach improves the zero-shot adaptation of pretrained entailment models. Secondly, we notice that self-training entailment-based models with unlabeled data can significantly improve the adaptation performance on downstream tasks. To achieve more stable improvement, we propose the Simple Pseudo-Label Editing (SimPLE) algorithm for better pseudo-labeling quality in self-training. We also found that both pretrained entailment-based models and the self-trained models are robust against adversarial evaluation data. Experiments on binary and multi-class classification tasks show that SimPLE leads to more robust self-training results, indicating that the self-trained entailment models are more efficient and trustworthy than large language models on language understanding tasks.
- Publication:
-
arXiv e-prints
- Pub Date:
- May 2023
- DOI:
- 10.48550/arXiv.2305.17197
- arXiv:
- arXiv:2305.17197
- Bibcode:
- 2023arXiv230517197G
- Keywords:
-
- Computer Science - Computation and Language
- E-Print:
- Accepted by ACL 2023 main conference