Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods

doi:10.48550/arXiv.2007.06162

Pre-training a language model on a large corpus and then fine-tuning it on task-specific data has become a common approach. In practice, we observe that fine-tuning a pre-trained model on a small dataset may lead to over- and/or under-estimation problems. In this paper, we propose MC-Tailor, a novel method that alleviates this issue in text generation tasks by truncating probability mass from over-estimated regions and transferring it to under-estimated ones. Experiments on a variety of text generation datasets show that MC-Tailor consistently and significantly outperforms the fine-tuning approach. Our code is available at this URL.

Publication: arXiv e-prints

Pub Date: July 2020

DOI: 10.48550/arXiv.2007.06162

arXiv: arXiv:2007.06162

Bibcode: 2020arXiv200706162M

Keywords: Computer Science - Computation and Language

E-Print: Accepted by ACL 2020
