Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers
Abstract
Coreference resolution is an intermediate step for text understanding. It is used in tasks and domains for which we do not necessarily have coreference-annotated corpora. Therefore, generalization is of special importance for coreference resolution. However, while recent coreference resolvers show notable improvements on the CoNLL dataset, they struggle to generalize properly to new domains or datasets. In this paper, we investigate the role of linguistic features in building more generalizable coreference resolvers. We show that generalization improves only slightly by merely using a set of additional linguistic features. However, employing features and subsets of their values that are informative for coreference resolution considerably improves generalization. Thanks to better generalization, our system achieves state-of-the-art results in out-of-domain evaluations. For example, on WikiCoref, our system, which is trained on CoNLL, performs on par with a system designed for this dataset.
- Publication: arXiv e-prints
- Pub Date: August 2017
- DOI: 10.48550/arXiv.1708.00160
- arXiv: arXiv:1708.00160
- Bibcode: 2017arXiv170800160S
- Keywords: Computer Science - Computation and Language
- E-Print: EMNLP 2018 long paper