On the impossibility of constructing good population mean estimators in a realistic Respondent Driven Sampling model
Abstract
Current methods for estimating population means from data collected by Respondent Driven Sampling (RDS) are based on the Horvitz-Thompson estimator, together with a set of assumptions on the sampling model under which the inclusion probabilities can be determined from the information contained in the data. In this paper, we argue that such assumptions are too simplistic to be realistic and that, under realistic sampling models, the situation is far more complicated. Specifically, we study a realistic RDS sampling model motivated by a real-world RDS dataset. We show that, for this model, the inclusion probabilities, which are necessary for applying the Horvitz-Thompson estimator, cannot be determined from the information in the sample alone. An implication is that, unless additional information about the underlying population network is obtained, it is hopeless to conceive of a general theory of population mean estimation from current RDS data.
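To make the role of the inclusion probabilities concrete, the following is a minimal Python sketch of the textbook Horvitz-Thompson mean estimator, together with its ratio (Hájek) form, which does not require the population size. It is an illustration only, not code from the paper: the function names, the toy sample, and both sets of inclusion probabilities are hypothetical. The point it shows is that the same sample yields different estimates under different assumed inclusion probabilities, so the estimate is only as trustworthy as the probabilities supplied.

```python
from typing import Sequence


def _check_inputs(values: Sequence[float], inclusion_probs: Sequence[float]) -> None:
    """Validate that every sampled unit has an inclusion probability in (0, 1]."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have the same length")
    if len(values) == 0:
        raise ValueError("the sample must not be empty")
    if any(not (0.0 < p <= 1.0) for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")


def horvitz_thompson_mean(
    values: Sequence[float],
    inclusion_probs: Sequence[float],
    population_size: int,
) -> float:
    """Horvitz-Thompson estimate of the population mean.

    Each observation is weighted by the inverse of its inclusion
    probability; the weighted total is divided by the known population size.
    """
    _check_inputs(values, inclusion_probs)
    weighted_total = sum(y / p for y, p in zip(values, inclusion_probs))
    return weighted_total / population_size


def hajek_mean(values: Sequence[float], inclusion_probs: Sequence[float]) -> float:
    """Ratio (Hajek) form: the population size is replaced by the sum of weights."""
    _check_inputs(values, inclusion_probs)
    weighted_total = sum(y / p for y, p in zip(values, inclusion_probs))
    weight_sum = sum(1.0 / p for p in inclusion_probs)
    return weighted_total / weight_sum


# Hypothetical toy sample: a binary trait observed on four respondents.
sample_values = [1, 0, 1, 0]

# Two hypothetical sets of inclusion probabilities for the same sample.
unequal_probs = [0.1, 0.2, 0.1, 0.2]
equal_probs = [0.15, 0.15, 0.15, 0.15]

estimate_unequal = hajek_mean(sample_values, unequal_probs)
estimate_equal = hajek_mean(sample_values, equal_probs)
print(estimate_unequal, estimate_equal)
```

With equal probabilities the ratio form reduces to the plain sample mean; with unequal probabilities it moves away from it. Neither set of probabilities can be checked from the four observations alone, which is the difficulty the abstract describes.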
- Publication:
- arXiv e-prints
- Pub Date:
- September 2012
- DOI:
- 10.48550/arXiv.1209.2072
- arXiv:
- arXiv:1209.2072
- Bibcode:
- 2012arXiv1209.2072G
- Keywords:
- Statistics - Methodology;
- Statistics - Applications
- E-Print:
- 13 pages, 2 figures