Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations
Abstract
Due to the huge amount of parameters, fine-tuning of pretrained language models (PLMs) is prone to overfitting in the low resource scenarios. In this work, we present a novel method that operates on the hidden representations of a PLM to reduce overfitting. During fine-tuning, our method inserts random autoencoders between the hidden layers of a PLM, which transform activations from the previous layers into multi-view compressed representations before feeding them into the upper layers. The autoencoders are plugged out after fine-tuning, so our method does not add extra parameters or increase computation cost during inference. Our method demonstrates promising performance improvement across a wide range of sequence- and token-level low-resource NLP tasks.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2022
- DOI:
- arXiv:
- arXiv:2211.08794
- Bibcode:
- 2022arXiv221108794L
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence;
- Computer Science - Machine Learning
- E-Print:
- Accepted by ACL 2023