Inner Product Similarity Search using Compositional Codes
Abstract
This paper addresses the nearest neighbor search problem under inner product similarity and introduces a compact code-based approach. The idea is to approximate a vector by the composition of several elements selected from a source dictionary and to represent this vector by a short code composed of the indices of the selected elements. The inner product between a query vector and a database vector is then efficiently estimated from the query vector and the short code of the database vector. Through theoretical and empirical analysis, we show that the proposed group $M$-selection algorithm, which selects $M$ elements from $M$ source dictionaries for vector approximation, achieves superior search accuracy and efficiency compared with other compact codes of the same length. Experimental results on large-scale datasets ($1M$ and $1B$ SIFT features, $1M$ linear models, and Netflix) demonstrate the superiority of the proposed approach.
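To illustrate how an inner product can be estimated from a query and a short code, here is a minimal NumPy sketch. It assumes the composition is a plain sum of one element from each of $M$ dictionaries, and that codes are already given (the encoding step is not shown). The function names, array shapes, and random data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def build_lookup_table(query, dictionaries):
    """Precompute <query, element> for every dictionary element.

    dictionaries: array of shape (M, K, d), i.e. M dictionaries of K elements.
    Returns a table of shape (M, K).
    """
    return dictionaries @ query

def estimate_inner_products(table, codes):
    """Estimate <query, x> for each coded vector using M table lookups.

    codes: integer array of shape (N, M); codes[n, m] is the index of the
    element selected from dictionary m for database vector n.
    """
    M = table.shape[0]
    # table[m, codes[n, m]] gathered for all n, m, then summed over m.
    return table[np.arange(M), codes].sum(axis=1)

def reconstruct(dictionaries, codes):
    """Rebuild the approximated vectors (assumed here to be a sum of elements)."""
    M = dictionaries.shape[0]
    return dictionaries[np.arange(M), codes].sum(axis=1)

# Hypothetical toy setup: random dictionaries and random codes.
rng = np.random.default_rng(0)
M, K, d, N = 4, 16, 8, 5
dictionaries = rng.normal(size=(M, K, d))
codes = rng.integers(0, K, size=(N, M))
query = rng.normal(size=d)

table = build_lookup_table(query, dictionaries)
estimates = estimate_inner_products(table, codes)

# The table-based estimate equals the inner product with the reconstruction.
exact_on_reconstruction = reconstruct(dictionaries, codes) @ query
print(np.allclose(estimates, exact_on_reconstruction))
```

Under these assumptions, the table costs $M \cdot K$ inner products per query, after which each database vector needs only $M$ lookups and additions, independent of the dimension $d$.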
- Publication: arXiv e-prints
- Pub Date: June 2014
- DOI: 10.48550/arXiv.1406.4966
- arXiv: arXiv:1406.4966
- Bibcode: 2014arXiv1406.4966D
- Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning; Statistics - Machine Learning
- E-Print: The approach presented in this paper (ECCV14 submission) is closely related to multi-stage vector quantization and residual quantization. We thank the reviewers (CVPR14 and ECCV14) for pointing out the relationship to these two algorithms. Related paper: http://sites.skoltech.ru/app/data/uploads/sites/2/2013/09/CVPR14.pdf, which also adopts the summation of vectors for vector approximation.