A Minimax Optimal Algorithm for Crowdsourcing
Abstract
We consider the problem of accurately estimating the reliability of workers based on noisy labels they provide, which is a fundamental question in crowdsourcing. We propose a novel lower bound on the minimax estimation error which applies to any estimation procedure. We further propose Triangular Estimation (TE), an algorithm for estimating the reliability of workers. TE has low complexity, may be implemented in a streaming setting when labels are provided by workers in real time, and does not rely on an iterative procedure. We further prove that TE is minimax optimal and matches our lower bound. We conclude by assessing the performance of TE and other state-of-the-art algorithms on both synthetic and real-world data sets.
- Publication: arXiv e-prints
- Pub Date: June 2016
- DOI: 10.48550/arXiv.1606.00226
- arXiv: arXiv:1606.00226
- Bibcode: 2016arXiv160600226B
- Keywords: Statistics - Machine Learning; Computer Science - Human-Computer Interaction; Computer Science - Machine Learning; Computer Science - Social and Information Networks
- E-Print: 19 pages, NIPS 2017