Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences

doi:10.48550/arXiv.2205.07603

The increase in NLP performance due to the prevalence of distributional models and deep learning has come with a corresponding decrease in interpretability. This has spurred a focus on what neural networks learn about natural language, with less attention to how they learn it. Some work has examined the data used to develop data-driven models, but this line of work typically aims to highlight issues with the data, e.g. identifying and offsetting harmful biases. This work contributes to the relatively untrodden path of investigating what data must contain for models to capture meaningful representations of natural language. To that end, it evaluates how well English and Spanish semantic spaces capture a particular type of relational knowledge, namely the traits associated with concepts (e.g. bananas-yellow), and explores the role of co-occurrences in this context.

Publication: arXiv e-prints

Pub Date: May 2022

DOI: 10.48550/arXiv.2205.07603

arXiv: arXiv:2205.07603

Bibcode: 2022arXiv220507603A

Keywords: Computer Science - Computation and Language; Computer Science - Artificial Intelligence

E-Print: Due to appear in the proceedings of *SEM 2022: The 11th Joint Conference on Lexical and Computational Semantics

Abstract