Neural Data Augmentation via Example Extrapolation
Abstract
In many applications of machine learning, certain categories of examples may be underrepresented in the training data, causing systems to underperform on such "few-shot" cases at test time. A common remedy is to perform data augmentation, such as by duplicating underrepresented examples, or heuristically synthesizing new examples. But these remedies often fail to cover the full diversity and complexity of real examples. We propose a data augmentation approach that performs neural Example Extrapolation (Ex2). Given a handful of exemplars sampled from some distribution, Ex2 synthesizes new examples that also belong to the same distribution. The Ex2 model is learned by simulating the example generation procedure on data-rich slices of the data, and it is applied to underrepresented, few-shot slices. We apply Ex2 to a range of language understanding tasks and significantly improve over state-of-the-art methods on multiple few-shot learning benchmarks, including for relation extraction (FewRel) and intent classification + slot filling (SNIPS).
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2021
- DOI:
- 10.48550/arXiv.2102.01335
- arXiv:
- arXiv:2102.01335
- Bibcode:
- 2021arXiv210201335L
- Keywords:
-
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence