RuleBert: Teaching Soft Rules to Pre-trained Language Models

doi:10.48550/arXiv.2109.13006

While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a classification task where, given facts and soft rules, the PLM should return a prediction with a probability for a given hypothesis. We release the first dataset for this task, and we propose a revised loss function that enables the PLM to learn how to predict precise probabilities for the task. Our evaluation results show that the resulting fine-tuned models achieve very high performance, even on logical rules that were unseen at training. Moreover, we demonstrate that logical notions expressed by the rules are transferred to the fine-tuned model, yielding state-of-the-art results on external datasets.
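
As a rough illustration of the task described in the abstract, the sketch below builds one hypothetical instance (a fact, a soft rule with a confidence, and a hypothesis) and scores a predicted probability against a soft target using ordinary binary cross-entropy. The abstract does not spell out the revised loss, so this is a stand-in rather than the authors' formulation. The predicate names, the 0.9 confidence, the `soft_label_bce` helper, and the simplification that the target probability equals the rule confidence are all assumptions made for the example.

```python
import math


def soft_label_bce(predicted: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy against a soft target probability in [0, 1].

    Assumption: a generic soft-label loss, not the paper's exact loss.
    """
    # Clamp the prediction so that log() never receives 0.
    p = min(max(predicted, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# Hypothetical task instance: one fact and one soft Horn rule.
example = {
    "facts": ["parent(Alice, Bob)"],
    "rules": [("parent(X, Y) -> older(X, Y)", 0.9)],  # (rule, confidence)
    "hypothesis": "older(Alice, Bob)",
    # Simplifying assumption: the target equals the rule confidence.
    "target_probability": 0.9,
}

# A prediction that matches the target probability ...
loss_exact = soft_label_bce(0.9, example["target_probability"])
# ... versus one that is overconfident about the hypothesis.
loss_overconfident = soft_label_bce(0.99, example["target_probability"])
```

With a soft target, the loss is smallest when the predicted probability matches the target, so an overconfident prediction of 0.99 is penalized more than an exact 0.9. This is the behavior needed for a model to learn precise probabilities rather than hard true/false labels.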

Publication:

arXiv e-prints

Pub Date:

September 2021

DOI:

10.48550/arXiv.2109.13006

arXiv:

arXiv:2109.13006

Bibcode:

2021arXiv210913006S

Keywords:

Computer Science - Artificial Intelligence;
Computer Science - Computation and Language;
Computer Science - Machine Learning;
Computer Science - Logic in Computer Science;
Computer Science - Neural and Evolutionary Computing;
68T50;
F.2.2;
I.2.7

E-Print:

Logical reasoning, soft Horn rules, Transformers, pre-trained language models, combining symbolic and probabilistic methods, BERT

NASA/ADS
