Identifying Relevant Document Facets for Keyword-Based Search Queries
Abstract
As structured documents with rich metadata (such as products and movies) become increasingly prevalent, searching them has become an important information retrieval (IR) problem. Although advanced search interfaces are widely available, most users still prefer keyword-based queries. Query keywords often imply hidden restrictions on the desired documents, and these restrictions can be represented as document facet-value pairs. Achieving high retrieval performance therefore depends on identifying the relevant facet-value pairs hidden in a query. In this paper, we study the problem of identifying document facet-value pairs that are relevant to a keyword-based search query. We propose a machine learning approach and a set of useful features, and evaluate our approach using a movie data set from INEX.
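To make the notion of a facet-value pair concrete, the following is a minimal sketch of how candidate pairs might be surfaced for a keyword query. It is not the paper's method: the abstract describes a machine learning approach whose features are not given here, whereas this sketch only uses token overlap as a stand-in for candidate generation. The catalog `MOVIES`, the facet names, and the function `candidate_facet_values` are all hypothetical.

```python
# Hypothetical toy catalog; facet names and values are illustrative only.
MOVIES = [
    {"title": "Saving Private Ryan", "director": "Steven Spielberg",
     "genre": "War", "year": "1998"},
    {"title": "Jaws", "director": "Steven Spielberg",
     "genre": "Thriller", "year": "1975"},
    {"title": "Platoon", "director": "Oliver Stone",
     "genre": "War", "year": "1986"},
]


def candidate_facet_values(query, documents, facets=("director", "genre", "year")):
    """Map each (facet, value) pair that shares a token with the query
    to the fraction of the value's tokens that appear in the query.

    Assumption: simple lowercase whitespace tokenization, no learning.
    """
    query_tokens = set(query.lower().split())
    scores = {}
    for doc in documents:
        for facet in facets:
            value = doc.get(facet)
            if value is None:
                continue
            value_tokens = set(value.lower().split())
            overlap = len(query_tokens & value_tokens) / len(value_tokens)
            if overlap > 0:
                scores[(facet, value)] = overlap
    return scores


# The query never names a facet, yet it implies restrictions on two of them.
candidates = candidate_facet_values("spielberg war movies", MOVIES)
for (facet, value), score in sorted(candidates.items()):
    print(f"{facet} = {value!r}: {score:.2f}")
```

In a learned setting, an overlap score like this could serve as one feature among many for a classifier that decides whether a candidate pair is relevant to the query; that pairing is an assumption for illustration, not a description of the features the paper proposes.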
- Publication: arXiv e-prints
- Pub Date: January 2015
- DOI: 10.48550/arXiv.1501.00744
- arXiv: arXiv:1501.00744
- Bibcode: 2015arXiv150100744Z
- Keywords: Computer Science - Information Retrieval