Mapping Unseen Words to Task-Trained Embedding Spaces

doi:10.48550/arXiv.1510.02387

We consider the supervised training setting in which we learn task-specific word embeddings. We assume that we start with initial embeddings learned from unlabelled data and update them to learn task-specific embeddings for words in the supervised training data. However, for new words in the test set, we must use either their initial embeddings or a single unknown embedding, which often leads to errors. We address this by learning a neural network to map from initial embeddings to the task-specific embedding space, via a multi-loss objective function. The technique is general, but here we demonstrate its use for improved dependency parsing (especially for sentences with out-of-vocabulary words), as well as for downstream improvements on sentiment analysis.
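The mapping idea can be illustrated with a minimal sketch. It fits a small network on words that have both an initial and a task-trained embedding, then applies it to the initial embeddings of unseen words. The abstract does not specify the architecture or the terms of the multi-loss objective, so everything concrete here is an assumption made for illustration: the single tanh layer, the combination of squared and absolute error, the weight `l1_weight`, and the function names `multi_loss`, `train_mapper`, and `map_embeddings`.

```python
import numpy as np


def multi_loss(pred, target, l1_weight=0.5):
    """Assumed stand-in objective: squared error plus weighted absolute error,
    summed over embedding dimensions and averaged over words."""
    diff = pred - target
    return float(np.mean(np.sum(diff ** 2 + l1_weight * np.abs(diff), axis=1)))


def map_embeddings(initial, W, b):
    """Map initial embeddings into the task-trained space (one tanh layer)."""
    return np.tanh(initial @ W + b)


def train_mapper(initial, task, lr=0.05, epochs=500, l1_weight=0.5, seed=0):
    """Fit W, b so that tanh(initial @ W + b) approximates the task embeddings.

    `initial` and `task` hold one row per word seen in supervised training.
    Plain full-batch gradient descent; hyperparameters are illustrative only.
    """
    rng = np.random.default_rng(seed)
    n, d_in = initial.shape
    d_out = task.shape[1]
    W = rng.normal(scale=0.1, size=(d_in, d_out))
    b = np.zeros(d_out)
    for _ in range(epochs):
        pred = map_embeddings(initial, W, b)
        diff = pred - task
        # Gradient of multi_loss with respect to the network output.
        grad_out = (2.0 * diff + l1_weight * np.sign(diff)) / n
        # Backpropagate through tanh: d tanh(z) / dz = 1 - tanh(z)^2.
        grad_pre = grad_out * (1.0 - pred ** 2)
        W -= lr * (initial.T @ grad_pre)
        b -= lr * grad_pre.sum(axis=0)
    return W, b


if __name__ == "__main__":
    # Synthetic data standing in for real embeddings (not from the paper).
    rng = np.random.default_rng(1)
    seen_initial = rng.normal(size=(200, 5))
    seen_task = np.tanh(seen_initial @ rng.normal(size=(5, 5)))
    W, b = train_mapper(seen_initial, seen_task)
    unseen_initial = rng.normal(size=(3, 5))
    print(map_embeddings(unseen_initial, W, b).shape)
```

At test time, a word absent from the supervised training data would receive `map_embeddings` applied to its initial embedding in place of that initial embedding or a single unknown-word vector.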

Publication: arXiv e-prints
Pub Date: October 2015
DOI: 10.48550/arXiv.1510.02387
arXiv: arXiv:1510.02387
Bibcode: 2015arXiv151002387S
Keywords: Computer Science - Computation and Language; Computer Science - Machine Learning
E-Print: 8 + 3 pages, 3 figures
