High-dimensional vector semantics
Abstract
In this paper we explore the “vector semantics” problem from the perspective of the “almost orthogonal” property of high-dimensional random vectors. We show that this intriguing property can be used to “memorize” random vectors by simply adding them, and we provide an efficient probabilistic solution to the set membership problem. We also discuss several applications to word context vector embeddings, document sentence similarity, and spam filtering.
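The idea of memorizing random vectors by adding them, and then testing set membership, can be illustrated with a minimal sketch. It is not taken from the paper: the bipolar (±1) entries, the dimension, the number of stored vectors, the 0.5 threshold, and the function names `random_vectors`, `memorize`, and `is_member` are all assumptions chosen for illustration.

```python
import numpy as np

def random_vectors(count, dim, rng):
    """Draw `count` random vectors with entries in {-1, +1} (an assumed encoding)."""
    return rng.choice([-1.0, 1.0], size=(count, dim))

def memorize(vectors):
    """Store a set of vectors as a single vector: their element-wise sum."""
    return vectors.sum(axis=0)

def is_member(memory, query, threshold=0.5):
    """Probabilistic membership test.

    For a stored vector, the normalized dot product is close to 1, because
    the query matches itself exactly and is almost orthogonal to the rest.
    For an unrelated random vector it is close to 0.
    """
    dim = query.shape[0]
    return float(memory @ query) / dim > threshold

rng = np.random.default_rng(0)
dim = 10_000                             # high dimension keeps cross terms small
stored = random_vectors(50, dim, rng)    # vectors added to the memory
others = random_vectors(50, dim, rng)    # vectors never added

memory = memorize(stored)
hits = sum(is_member(memory, v) for v in stored)
false_alarms = sum(is_member(memory, v) for v in others)
print(f"recognized {hits} of {len(stored)} stored vectors")
print(f"false alarms: {false_alarms} of {len(others)}")
```

With these assumed settings, the cross terms between distinct random vectors have a standard deviation of roughly 0.07 after normalization, so the stored and unrelated cases are well separated by the threshold. The separation shrinks as more vectors are added or as the dimension is reduced.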
- Publication:
- International Journal of Modern Physics C
- Pub Date:
- 2018
- DOI:
- 10.1142/S0129183118500158
- arXiv:
- arXiv:1802.09914
- Bibcode:
- 2018IJMPC..2950015A
- Keywords:
- Probability and statistics;
- artificial intelligence;
- 02.50.-r;
- 07.05.Mh;
- Neural networks, fuzzy logic, artificial intelligence;
- Computer Science - Computation and Language;
- Computer Science - Artificial Intelligence;
- Computer Science - Machine Learning;
- Statistics - Machine Learning
- E-Print:
- 12 pages, 5 figures, Int. J. Mod. Phys. C, 2018