Reconstruction of Word Embeddings from Sub-Word Parameters
Abstract
Pre-trained word embeddings improve the performance of a neural model at the cost of increasing the model size. We propose to benefit from this resource without paying the cost by operating strictly at the sub-lexical level. Our approach is quite simple: before task-specific training, we first optimize sub-word parameters to reconstruct pre-trained word embeddings using various distance measures. We report interesting results on a variety of tasks: word similarity, word analogy, and part-of-speech tagging.
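The reconstruction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes, purely for illustration, that a word vector is the mean of its character trigram vectors and that the distance measure is squared Euclidean distance. The names `char_ngrams`, `fit_subword_params`, and `reconstruct` are hypothetical, and the target vectors are random toy data standing in for real pre-trained embeddings.

```python
import numpy as np

def char_ngrams(word, n=3):
    """Return the character n-grams of a word padded with boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def fit_subword_params(targets, dim, n=3, lr=0.5, epochs=200, seed=0):
    """Fit n-gram vectors so that their mean reconstructs each target vector.

    Assumption: squared Euclidean distance, minimized by plain SGD.
    """
    rng = np.random.default_rng(seed)
    vocab = sorted({g for w in targets for g in char_ngrams(w, n)})
    index = {g: i for i, g in enumerate(vocab)}
    params = rng.normal(scale=0.1, size=(len(vocab), dim))
    for _ in range(epochs):
        for word, target in targets.items():
            ids = [index[g] for g in char_ngrams(word, n)]
            pred = params[ids].mean(axis=0)
            # Gradient of ||pred - target||^2 w.r.t. each contributing n-gram.
            grad = 2.0 * (pred - target) / len(ids)
            # subtract.at accumulates correctly if an n-gram repeats in a word.
            np.subtract.at(params, ids, lr * grad)
    return params, index

def reconstruct(word, params, index, n=3):
    """Compose a word vector from known n-grams (zeros if none are known)."""
    ids = [index[g] for g in char_ngrams(word, n) if g in index]
    if not ids:
        return np.zeros(params.shape[1])
    return params[ids].mean(axis=0)

# Toy stand-in for pre-trained embeddings (random, not real data).
rng = np.random.default_rng(1)
targets = {w: rng.normal(size=8) for w in ["cat", "cart", "dog"]}
params, index = fit_subword_params(targets, dim=8)
errors = {w: float(np.sum((reconstruct(w, params, index) - v) ** 2))
          for w, v in targets.items()}
```

After fitting, only the n-gram table needs to be kept, and a vector for an unseen word such as `"cats"` can still be composed from whichever of its trigrams were observed. Other distance measures could be swapped in by changing the gradient line.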
- Publication: arXiv e-prints
- Pub Date: July 2017
- DOI: 10.48550/arXiv.1707.06957
- arXiv: arXiv:1707.06957
- Bibcode: 2017arXiv170706957S
- Keywords: Computer Science - Computation and Language
- E-Print: EMNLP 2017, Workshop on Subword and Character Level Models in NLP