Good-Enough Example Extrapolation
Abstract
This paper asks whether extrapolating the hidden space distribution of text examples from one class onto another is a valid inductive bias for data augmentation. To operationalize this question, I propose a simple data augmentation protocol called "good-enough example extrapolation" (GE3). GE3 is lightweight and has no hyperparameters. Applied to three text classification datasets for various data imbalance scenarios, GE3 improves performance more than upsampling and other hidden-space data augmentation methods.
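The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading of "extrapolating the hidden space distribution of text examples from one class onto another": each example's offset from its own class centroid is re-applied around the centroid of every other class. The function names (`centroid`, `extrapolate_examples`) and the use of plain Python lists as stand-ins for encoder hidden vectors are assumptions made for illustration, not the paper's code.

```python
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def extrapolate_examples(hidden, labels):
    """Return (vector, label) pairs made by shifting each example from its
    own class centroid to every other class centroid.

    Assumption: `hidden` holds fixed-size hidden representations (for
    example, sentence embeddings) and `labels` holds their class labels.
    """
    by_class = defaultdict(list)
    for vector, label in zip(hidden, labels):
        by_class[label].append(vector)

    centroids = {label: centroid(vectors) for label, vectors in by_class.items()}

    augmented = []
    for source, vectors in by_class.items():
        for target, target_centroid in centroids.items():
            if target == source:
                continue
            for vector in vectors:
                # Keep the example's offset from its source centroid,
                # but re-center it on the target class centroid.
                shifted = [
                    v - s + t
                    for v, s, t in zip(vector, centroids[source], target_centroid)
                ]
                augmented.append((shifted, target))
    return augmented


if __name__ == "__main__":
    toy_hidden = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
    toy_labels = [0, 0, 1]
    for vector, label in extrapolate_examples(toy_hidden, toy_labels):
        print(label, vector)
```

Under this reading the step has no tunable settings, which matches the abstract's claim that GE3 has no hyperparameters: a class with few examples borrows the spread of better-represented classes rather than duplicating its own points, as upsampling would.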
- Publication: arXiv e-prints
- Pub Date: September 2021
- arXiv: arXiv:2109.05602
- Bibcode: 2021arXiv210905602W
- Keywords: Computer Science - Computation and Language
- E-Print: Camera-ready for EMNLP 2021 main conference. V2 is corrected with SMOTE citation and model setup language is clarified