Toward a New Protocol to Evaluate Recommender Systems
Abstract
In this paper, we propose an approach to analyze the performance and the added value of automatic recommender systems in an industrial context. We show that recommender systems are multifaceted and can be organized around 4 structuring functions: help users to decide, help users to compare, help users to discover, help users to explore. A global off line protocol is then proposed to evaluate recommender systems. This protocol is based on the definition of appropriate evaluation measures for each aforementioned function. The evaluation protocol is discussed from the perspective of the usefulness and trust of the recommendation. A new measure called Average Measure of Impact is introduced. This measure evaluates the impact of the personalized recommendation. We experiment with two classical methods, K-Nearest Neighbors (KNN) and Matrix Factorization (MF), using the well known dataset: Netflix. A segmentation of both users and items is proposed to finely analyze where the algorithms perform well or badly. We show that the performance is strongly dependent on the segments and that there is no clear correlation between the RMSE and the quality of the recommendation.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2012
- DOI:
- 10.48550/arXiv.1209.1983
- arXiv:
- arXiv:1209.1983
- Bibcode:
- 2012arXiv1209.1983M
- Keywords:
-
- Computer Science - Information Retrieval;
- Computer Science - Performance;
- H.3.3;
- H.3.4
- E-Print:
- 6 pages. arXiv admin note: text overlap with arXiv:1203.4487