Learning Contextualized Semantics from Co-occurring Terms via a Siamese Architecture

doi:10.48550/arXiv.1506.05514

One of the biggest challenges in multimedia information retrieval and understanding is to bridge the semantic gap by properly modeling concept semantics in context. The presence of out-of-vocabulary (OOV) concepts exacerbates this difficulty. To address the semantic gap, we formulate the problem of learning contextualized semantics from descriptive terms and propose a novel Siamese architecture to model them. By means of pattern aggregation and probabilistic topic models, our Siamese architecture captures contextualized semantics from co-occurring descriptive terms via unsupervised learning, which leads to a concept embedding space of the terms in context. Furthermore, co-occurring OOV concepts can be easily represented in the learnt concept embedding space. The main properties of the concept embedding space are demonstrated via visualization. Using various settings in semantic priming, we have carried out a thorough evaluation by comparing our approach to a number of state-of-the-art methods on six annotation corpora in different domains: MagTag5K, CAL500 and Million Song Dataset in the music domain, and Corel5K, LabelMe and SUNDatabase in the image domain. Experimental results on semantic priming suggest that our approach considerably outperforms those state-of-the-art methods in various aspects.

Publication:

arXiv e-prints

Pub Date:

June 2015

DOI:

10.48550/arXiv.1506.05514

arXiv:

arXiv:1506.05514

Bibcode:

2015arXiv150605514S

Keywords:

Computer Science - Information Retrieval;
Computer Science - Computation and Language;
Computer Science - Machine Learning;
I.2.6
