Inner Product Similarity Search using Compositional Codes

doi:10.48550/arXiv.1406.4966

This paper addresses the nearest neighbor search problem under inner product similarity and introduces a compact code-based approach. The idea is to approximate a vector by the composition of several elements selected from a source dictionary and to represent the vector by a short code made up of the indices of the selected elements. The inner product between a query vector and a database vector is then estimated efficiently from the query vector and the short code of the database vector. Through theoretical and empirical analysis, we show that the proposed group $M$-selection algorithm, which selects $M$ elements from $M$ source dictionaries for vector approximation, achieves superior search accuracy and efficiency for compact codes of the same length. Experimental results on large-scale datasets ($1M$ and $1B$ SIFT features, $1M$ linear models and Netflix) demonstrate the superiority of the proposed approach.
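The estimation step described in the abstract can be illustrated with a minimal sketch. It assumes $M$ dictionaries of $K$ elements each, filled with random values rather than learned ones, and it uses a greedy residual-style encoder as a stand-in; it is not the paper's group $M$-selection training procedure. All function and variable names (`encode`, `build_lookup`, `estimate_inner_product`, and so on) are hypothetical. The point shown is that once a database vector is stored as $M$ indices, its inner product with a query reduces to $M$ table lookups and additions.

```python
import random

def dot(a, b):
    """Plain inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def encode(x, dictionaries):
    """Pick one element index per dictionary, greedily shrinking the residual.
    Assumption: a residual-style stand-in, not the paper's learned selection."""
    residual = list(x)
    code = []
    for D in dictionaries:
        best = min(
            range(len(D)),
            key=lambda k: sum((r - c) ** 2 for r, c in zip(residual, D[k])),
        )
        code.append(best)
        residual = [r - c for r, c in zip(residual, D[best])]
    return code

def decode(code, dictionaries):
    """Reconstruct the approximation as the sum of the selected elements."""
    approx = [0.0] * len(dictionaries[0][0])
    for D, k in zip(dictionaries, code):
        approx = [a + c for a, c in zip(approx, D[k])]
    return approx

def build_lookup(query, dictionaries):
    """Precompute <query, element> for every element of every dictionary."""
    return [[dot(query, c) for c in D] for D in dictionaries]

def estimate_inner_product(table, code):
    """Estimate <query, x> from the short code alone: M lookups and a sum."""
    return sum(table[m][k] for m, k in enumerate(code))

# Toy setup with random (not learned) dictionaries; sizes are illustrative.
rng = random.Random(0)
dim, M, K = 8, 4, 16
dictionaries = [
    [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(K)] for _ in range(M)
]
x = [rng.gauss(0, 1) for _ in range(dim)]
q = [rng.gauss(0, 1) for _ in range(dim)]

code = encode(x, dictionaries)
table = build_lookup(q, dictionaries)
estimate = estimate_inner_product(table, code)
exact_on_approx = dot(q, decode(code, dictionaries))
```

The lookup table is built once per query and reused for every database code, so the per-vector cost depends on $M$ rather than on the vector dimension. By linearity of the inner product, `estimate` matches the inner product between the query and the reconstructed vector up to floating-point rounding.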

Publication:

arXiv e-prints

Pub Date:

June 2014

DOI:

10.48550/arXiv.1406.4966

arXiv:

arXiv:1406.4966

Bibcode:

2014arXiv1406.4966D

Keywords:

Computer Science - Computer Vision and Pattern Recognition;
Computer Science - Machine Learning;
Statistics - Machine Learning

E-Print:

The approach presented in this paper (ECCV14 submission) is closely related to multi-stage vector quantization and residual quantization. We thank the reviewers (CVPR14 and ECCV14) for pointing out the relationship to the two algorithms. Related paper: http://sites.skoltech.ru/app/data/uploads/sites/2/2013/09/CVPR14.pdf, which also adopts the summation of vectors for vector approximation.
