Optimizing the Wisdom of the Crowd: Inference, Learning, and Teaching
Abstract
The unprecedented demand for large amounts of data has catalyzed the trend of combining human insights with machine learning techniques, which facilitates the use of crowdsourcing to collect label information both effectively and efficiently. Classic work on crowdsourcing mainly focuses on the label inference problem under the categorization setting. However, inferring the true label requires sophisticated aggregation models that usually perform well only under certain assumptions. Moreover, no matter how complicated the aggregation model is, the true model that generated the crowd labels remains unknown, so label inference can never recover the ground truth perfectly. Because crowdsourced labels are abundant and aggregation discards rich annotation information (e.g., which worker provided which labels), we believe that it is critical to take the diverse labeling abilities of the crowdsourcing workers, as well as their correlations, into consideration. To address this challenge, we propose to tackle three research problems, namely inference, learning, and teaching.
- Publication: arXiv e-prints
- Pub Date: June 2018
- DOI: 10.48550/arXiv.1806.09018
- arXiv: arXiv:1806.09018
- Bibcode: 2018arXiv180609018Z
- Keywords: Statistics - Machine Learning; Computer Science - Human-Computer Interaction; Computer Science - Machine Learning
- E-Print: 3 pages, SBP-BRiMS 18 Doctoral Consortium