When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
Abstract
We address hypernymy detection, i.e., whether an is-a relationship exists between words (x, y), with the help of large textual corpora. Most conventional approaches to this task have been categorized to be either pattern-based or distributional. Recent studies suggest that pattern-based ones are superior, if large-scale Hearst pairs are extracted and fed, with the sparsity of unseen (x, y) pairs relieved. However, they become invalid in some specific sparsity cases, where x or y is not involved in any pattern. For the first time, this paper quantifies the non-negligible existence of those specific cases. We also demonstrate that distributional methods are ideal to make up for pattern-based ones in such cases. We devise a complementary framework, under which a pattern-based and a distributional model collaborate seamlessly in cases which they each prefer. On several benchmark datasets, our framework achieves competitive improvements and the case study shows its better interpretability.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2020
- DOI:
- 10.48550/arXiv.2010.04941
- arXiv:
- arXiv:2010.04941
- Bibcode:
- 2020arXiv201004941Y
- Keywords:
-
- Computer Science - Computation and Language
- E-Print:
- Accepted by EMNLP2020 Main Conference