Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences
Abstract
The increase in performance in NLP due to the prevalence of distributional models and deep learning has brought with it a reciprocal decrease in interpretability. This has spurred a focus on what neural networks learn about natural language, with less attention paid to how they learn it. Some work has focused on the data used to develop data-driven models, but typically this line of work aims to highlight issues with the data, e.g. identifying and offsetting harmful biases. This work contributes to the relatively untrodden path of investigating what is required in data for models to capture meaningful representations of natural language. This entails evaluating how well English and Spanish semantic spaces capture a particular type of relational knowledge, namely the traits associated with concepts (e.g. bananas-yellow), and exploring the role of co-occurrences in this context.
- Publication:
- arXiv e-prints
- Pub Date:
- May 2022
- DOI:
- 10.48550/arXiv.2205.07603
- arXiv:
- arXiv:2205.07603
- Bibcode:
- 2022arXiv220507603A
- Keywords:
- Computer Science - Computation and Language; Computer Science - Artificial Intelligence
- E-Print:
- Due to appear in the proceedings of *SEM 2022: The 11th Joint Conference on Lexical and Computational Semantics